file_get_contents does not work with UTF-8


镜乃Kagamino
2025-03-18 02:49:45 (6 days ago)


I am trying to fetch Thai characters from a website. I tried:

$rawChapter = file_get_contents("URL");
$rawChapter = mb_convert_encoding($rawChapter, 'UTF-8', mb_detect_encoding($rawChapter, 'UTF-8, I

2 replies
  1. 0# 荀彧. | 2019-08-31 10-32



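    Change your Accept-Charset to UTF-8, because ISO-8859-1 does not support Thai characters. If you are running the PHP script on a Windows machine, you can also use the windows-874 charset. You can also try adding this header:

        Content-Language: th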


    In most cases, though, UTF-8 handles nearly every character and character set without any further declaration.
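As a rough illustration of the advice above, request headers can be handed to file_get_contents through a stream context. This is a minimal sketch, not the answerer's code: buildHttpOptions is a hypothetical helper name, the URL in the comment is a placeholder, and a server is free to ignore both headers.

```php
<?php
// Hypothetical helper: builds stream-context options that send
// the given request header lines with a GET request.
function buildHttpOptions(array $headers): array
{
    return [
        'http' => [
            'method' => 'GET',
            // Header lines are joined with CRLF, as HTTP expects.
            'header' => implode("\r\n", $headers),
        ],
    ];
}

$opts = buildHttpOptions([
    'Accept-Charset: UTF-8',
    'Content-Language: th',
]);
$context = stream_context_create($opts);

// The context is then passed as the third argument, for example:
// $rawChapter = file_get_contents('URL', false, $context);
```

Keeping the option array in a small function makes it easy to swap headers while experimenting with different charsets.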





    UPDATE



    It's strange, but this works for me.




    $opts = array(
      'http' => array(
        'method' => "GET",
        'header' => implode("\r\n", array(
          'Content-type: text/plain; charset=TIS-620'
          //'Content-type: text/plain; charset=windows-874' // same thing
        ))
      )
    );

    $context = stream_context_create($opts);

    //$fp = fopen('http://thaipope.org/webbible/01_002.htm', 'rb', false, $context);
    //$contents = stream_get_contents($fp);
    //fclose($fp);
    $contents = file_get_contents("http://thaipope.org/webbible/01_002.htm", false, $context);

    // Tell the browser that the bytes being echoed are TIS-620
    header('Content-type: text/html; charset=TIS-620');
    //header('Content-type: text/html; charset=windows-874'); // same thing

    echo $contents;
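The snippet above hard-codes TIS-620. One way to check which charset a server actually declares is to inspect the Content-Type response header. The following is a minimal sketch under assumptions: charsetFromHeaders is a hypothetical helper, and the sample header lines are made up for illustration rather than captured from the site in question.

```php
<?php
// Hypothetical helper: returns the lowercased charset declared in the
// last Content-Type header of a list of raw header lines, or null.
function charsetFromHeaders(array $headerLines): ?string
{
    $charset = null;
    foreach ($headerLines as $line) {
        if (preg_match('/^Content-Type:.*?charset=["\']?([\w.:-]+)/i', $line, $m)) {
            // Keep the last match: after redirects, earlier responses
            // may appear first in the list.
            $charset = strtolower($m[1]);
        }
    }
    return $charset;
}

// Made-up sample in the shape PHP exposes via $http_response_header
// after an HTTP file_get_contents() call.
$sample = [
    'HTTP/1.1 200 OK',
    'Content-Type: text/html; charset=TIS-620',
];
$charset = charsetFromHeaders($sample);
```

A page can also declare its charset only in a meta tag, or declare the wrong one, so the header value is a hint rather than a guarantee.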


    Apparently I was wrong about UTF-8 in this case. See here for more details. You can still produce UTF-8 output, though:




    $in_charset = 'TIS-620';   // == 'windows-874'
    $out_charset = 'utf-8';

    $opts = array(
      'http' => array(
        'method' => "GET",
        'header' => implode("\r\n", array(
          'Content-type: text/plain; charset=' . $in_charset
        ))
      )
    );

    $context = stream_context_create($opts);

    $contents = file_get_contents("http://thaipope.org/webbible/01_002.htm", false, $context);
    // Convert the fetched bytes only when input and output charsets differ
    if ($in_charset != $out_charset) {
      $contents = iconv($in_charset, $out_charset, $contents);
    }

    header('Content-type: text/html; charset=' . $out_charset);

    echo $contents; // output in UTF-8
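The conversion step can be tried in isolation, without any network access. Below is a minimal sketch, assuming the mbstring extension is loaded and that the local iconv build knows TIS-620. toUtf8 is a hypothetical helper, not part of the answer above, and it leaves input alone when it is already valid UTF-8.

```php
<?php
// Hypothetical helper: returns $bytes as UTF-8, converting from
// $fallbackCharset only when the input is not already valid UTF-8.
function toUtf8(string $bytes, string $fallbackCharset = 'TIS-620'): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already valid UTF-8 (plain ASCII included)
    }
    $converted = iconv($fallbackCharset, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $fallbackCharset to UTF-8");
    }
    return $converted;
}

// "\xA1\xD2" is the TIS-620 encoding of the Thai letters U+0E01 and U+0E32.
$thai = toUtf8("\xA1\xD2");
echo bin2hex($thai), "\n";
```

Checking for valid UTF-8 first avoids converting text twice. The check is a heuristic, since some byte sequences are valid in more than one encoding.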
