admin posted on 2024-11-11 14:46:43

Dynamic content problem when fetching a web page with PHP cURL; looking for a solution.

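Thank you for your reply!
I am trying to fetch the full content of a Toutiao video page with PHP, but the page content keeps shifting as it loads. I understand this is probably because the content is created dynamically with JavaScript. Is there a technique or method for getting the complete content?
Here is the relevant information:
- Video demo: https://thumbsnap.com/qkDgKM5o
- Page URL: https://www.toutiao.com/video/7418557232318513703/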
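To test the suspicion that the content is built by JavaScript, one rough check is to measure how much visible text the raw HTML carries once script and style blocks are removed. The sketch below is only an illustration: `visible_text_bytes` is a hypothetical helper name, and the idea that a near-zero result points to client-side rendering is a heuristic, not something verified against Toutiao.

```php
// Hypothetical helper: rough size (in bytes) of the visible text in raw HTML.
function visible_text_bytes(string $html): int {
    // Drop <script>, <style> and <noscript> blocks together with their contents.
    $stripped = preg_replace('#<(script|style|noscript)\b[^>]*>.*?</\1>#is', ' ', $html) ?? $html;
    // Remove the remaining tags and decode HTML entities.
    $text = html_entity_decode(strip_tags($stripped), ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse whitespace runs; fall back to the uncollapsed text if the input is not valid UTF-8.
    $text = preg_replace('/\s+/u', ' ', $text) ?? $text;
    return strlen(trim($text));
}
```

If the function returns a very small number for a page that looks full in the browser, the body is most likely filled in after load, and a plain HTTP request will never see it.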
I tried simulating a browser request, but I still cannot get the full content. I am using the code below and would appreciate any suggestions:
```php
function toutiao_html_header($url) {
    // Request headers only. Set-Cookie is a response header and must not be sent;
    // values such as tt_webid belong in the Cookie header below.
    $header = array(
        "Host: www.toutiao.com", // host name only, no scheme
        "Referer: {$url}",
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
        'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8',
        'Cookie: ******************************************************************************',
        "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36",
    );
    return $header;
}

function fetch_toutiao_video_html($url) {
    $header = toutiao_html_header($url);
    $timeout = 40;
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    // Request headers are set with CURLOPT_HTTPHEADER; CURLOPT_HEADER is a boolean
    // that only toggles response headers in the output.
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
    // Keep TLS verification on; disabling it exposes the request to interception.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_ENCODING, ""); // accept every encoding cURL supports
    // Return the body as a string instead of printing it.
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
    // Referer and User-Agent already come from the header array; CURLOPT_REFERER
    // and CURLOPT_USERAGENT expect strings, not arrays.
    $content = curl_exec($ch);
    if ($content === false) {
        echo 'Error: ' . curl_error($ch);
    }
    // Close the handle on every path before returning.
    curl_close($ch);
    return $content; // false on failure
}
```
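One thing I could still try without running JavaScript: pages that render on the client sometimes ship their initial data as JSON inside `<script type="application/json">` tags. The sketch below collects such payloads from a raw HTML string. `extract_embedded_json` is a hypothetical helper name, and whether this particular page embeds its data this way (or percent-encodes it) is an assumption I have not verified.

```php
// Hypothetical helper: collect JSON payloads from <script type="application/json"> tags.
function extract_embedded_json(string $html): array {
    $payloads = [];
    $pattern = '#<script\b[^>]*\btype=["\']application/(?:ld\+)?json["\'][^>]*>(.*?)</script>#is';
    if (!preg_match_all($pattern, $html, $matches)) {
        return $payloads; // no matching script tags (or a regex error)
    }
    foreach ($matches[1] as $raw) {
        $raw = trim($raw);
        $data = json_decode($raw, true);
        if ($data === null) {
            // Assumption: some sites percent-encode the JSON body; try that form too.
            $data = json_decode(rawurldecode($raw), true);
        }
        if (is_array($data)) {
            $payloads[] = $data;
        }
    }
    return $payloads;
}
```

If the returned array is empty, the data is probably loaded by later requests from the page's scripts, and something that actually executes JavaScript (for example a headless browser) would be needed instead.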
Looking forward to your reply!