Web scraping Google Jobs Listing with Nodejs
#node #webscraping #serpapi

Intro

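In this blog post, I'll show you how to use the Google Jobs Listing Results API from SerpApi.

The main advantage of our API is that you don't need to use browser automation to scrape results, write a parser from scratch, and then maintain it.

There is also a chance that Google will block requests at some point. We handle that on our backend, so you don't need to figure out how to do it yourself or which CAPTCHA solver or proxy provider to use.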

What will be scraped


Full code

If you don't need an explanation, have a look at the full code example in the online IDE.

const SerpApi = require("google-search-results-nodejs");
const search = new SerpApi.GoogleSearch(process.env.API_KEY);

const jobId =
  "eyJqb2JfdGl0bGUiOiJCYXJpc3RhIiwiY29tcGFueV9uYW1lIjoiQmx1ZSBCb3R0bGUgQ29mZmVlIiwiY29tcGFueV9taWQiOiIvbS8wM3AxMnFkIiwiYWRkcmVzc19jaXR5IjoiU2FuIEZyYW5jaXNjbyIsImFkZHJlc3Nfc3RhdGUiOiJDYWxpZm9ybmlhIiwiaHRpZG9jaWQiOiJBNnFRZGw1VjZvU1ZxbnI0QUFBQUFBPT0iLCJ1dWxlIjoidytDQUlRSUNJa1UyRnVJRVp5WVc1amFYTmpieUJDWVhrZ1FYSmxZU3hWYm1sMFpXUWdVM1JoZEdWeiIsImZjIjoiRW80QkNtY3dZVFZSY0RkWE9VTkJSek42WVdocVVHRkZXV2hqTkdGamJGSm5lRUpmWVhOc2NVRmlhRXhKWHpaek1Hb3hWamx2UWxKSlpGVnBVRUpyV1ZkeU1sQXpXRmhVVDB3MVRHbFNjamh5T0VzelRVcGhhM2x6WjFsQ1NuZFBWMjVJTlVKQlJWTjJOREZCRWhaME5rUXdXSE5pYjB4TmRWVXRaMVEwTkc5SFNVTkJHZ3RQTFVKSGRqaFNVWEp4TkEiLCJmY2VudiI6IjEiLCJmY3YiOiIyIiwiZmNfaWQiOiJmY183In0";

const params = {
  engine: "google_jobs_listing", // search engine
  q: jobId, // job id from https://serpapi.com/playground?engine=google_jobs&q=Barista
};

const getJson = () => {
  return new Promise((resolve) => {
    search.json(params, resolve);
  });
};

const getResults = async () => {
  const json = await getJson();
  const { apply_options = "No apply options for this job", salaries = "No salaries for this job", ratings = "No ratings for this job" } = json;
  return { applyOptions: apply_options, salaries, ratings };
};

getResults().then(console.log);

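Preparation

First, we need to create a Node.js* project and add the npm package google-search-results-nodejs to scrape and parse results using SerpApi.

To do this, open the command line in the directory of our project and enter: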

$ npm init -y

And then:

$ npm i google-search-results-nodejs

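*If you don't have Node.js installed, you can download it from nodejs.org and follow the installation documentation.

Code explanation

First, we need to declare SerpApi from the google-search-results-nodejs library and define a new search instance with your API key from SerpApi: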

const SerpApi = require("google-search-results-nodejs");
const search = new SerpApi.GoogleSearch(process.env.API_KEY); // your API key from SerpApi
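
Because the full example reads the key from the API_KEY environment variable, it can help to fail early when that variable is missing. Here is a minimal sketch of that idea. The helper name requireEnv is hypothetical and is not part of the SerpApi library:

```javascript
// Hypothetical helper: read a required environment variable or fail early
// with a readable message instead of sending a request without a key.
const requireEnv = (name, env = process.env) => {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Environment variable ${name} is not set`);
  }
  return value;
};

// Possible usage with the search instance from this post (commented out so
// the sketch runs without the package installed):
// const search = new SerpApi.GoogleSearch(requireEnv("API_KEY"));
```

On Linux or macOS, the variable can be set for a single run with something like API_KEY=your_key node index.js, where index.js is a placeholder file name.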

Next, we write the job ID and the parameters needed to make a request. You can get the job ID by following the Web scraping Google Jobs organic results with Nodejs blog post:

const jobId =
  "eyJqb2JfdGl0bGUiOiJCYXJpc3RhIiwiY29tcGFueV9uYW1lIjoiQmx1ZSBCb3R0bGUgQ29mZmVlIiwiY29tcGFueV9taWQiOiIvbS8wM3AxMnFkIiwiYWRkcmVzc19jaXR5IjoiU2FuIEZyYW5jaXNjbyIsImFkZHJlc3Nfc3RhdGUiOiJDYWxpZm9ybmlhIiwiaHRpZG9jaWQiOiJBNnFRZGw1VjZvU1ZxbnI0QUFBQUFBPT0iLCJ1dWxlIjoidytDQUlRSUNJa1UyRnVJRVp5WVc1amFYTmpieUJDWVhrZ1FYSmxZU3hWYm1sMFpXUWdVM1JoZEdWeiIsImZjIjoiRW80QkNtY3dZVFZSY0RkWE9VTkJSek42WVdocVVHRkZXV2hqTkdGamJGSm5lRUpmWVhOc2NVRmlhRXhKWHpaek1Hb3hWamx2UWxKSlpGVnBVRUpyV1ZkeU1sQXpXRmhVVDB3MVRHbFNjamh5T0VzelRVcGhhM2x6WjFsQ1NuZFBWMjVJTlVKQlJWTjJOREZCRWhaME5rUXdXSE5pYjB4TmRWVXRaMVEwTkc5SFNVTkJHZ3RQTFVKSGRqaFNVWEp4TkEiLCJmY2VudiI6IjEiLCJmY3YiOiIyIiwiZmNfaWQiOiJmY183In0";

const params = {
  engine: "google_jobs_listing", // search engine
  q: jobId, // job id from https://serpapi.com/playground?engine=google_jobs&q=Barista
};
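
As a side note, the sample jobId in this post looks like Base64-encoded JSON. The sketch below decodes only the first 80 characters of it to show which job the ID refers to. This is an observation about this particular value, not documented behavior, so treat the ID as an opaque string in real code:

```javascript
// First 80 characters of the jobId used in this post. 80 is a multiple of 4,
// so this prefix can be decoded as Base64 on its own.
const jobIdPrefix =
  "eyJqb2JfdGl0bGUiOiJCYXJpc3RhIiwiY29tcGFueV9uYW1lIjoiQmx1ZSBCb3R0bGUgQ29mZmVlIiwi";

// Buffer is a Node.js global, so no import is needed.
const decodedPrefix = Buffer.from(jobIdPrefix, "base64").toString("utf8");

console.log(decodedPrefix);
// {"job_title":"Barista","company_name":"Blue Bottle Coffee","
```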

Next, we wrap the search method from the SerpApi library in a promise so that we can work with the search results further:

const getJson = () => {
  return new Promise((resolve) => {
    search.json(params, resolve);
  });
};
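
The getJson wrapper above never rejects, so a failed request is hard to notice. Below is a rough sketch of a variant that rejects on failure. It assumes that a failed request is reported as an error string inside the returned JSON. The names getJsonSafe and fakeSearch are hypothetical, and fakeSearch only stands in for the real search object so the sketch runs offline:

```javascript
// Wrap any client exposing json(params, callback) in a promise.
// Assumption: a failed request shows up as an `error` string in the JSON.
const getJsonSafe = (client, params) =>
  new Promise((resolve, reject) => {
    try {
      client.json(params, (json) => {
        if (json && json.error) {
          reject(new Error(json.error));
        } else {
          resolve(json);
        }
      });
    } catch (err) {
      reject(err); // synchronous failures thrown by client.json
    }
  });

// Stand-in for the `search` object from this post (made-up responses).
const fakeSearch = {
  json: (params, callback) =>
    callback(params.q ? { ratings: [] } : { error: "Missing query parameter q" }),
};

getJsonSafe(fakeSearch, { q: "some-job-id" }).then(console.log);
getJsonSafe(fakeSearch, {}).catch((err) => console.log(err.message));
```

With the real library, the call would be getJsonSafe(search, params), followed by a .catch handler or a try/catch around await.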

And finally, we declare the getResults function that gets the data from the page and returns it:

const getResults = async () => {
  ...
};

In this function, we get the JSON with the results (getJson function). Then we destructure the received json, set default values for each destructured property, and return the results:

const json = await getJson();
const {
  apply_options = "No apply options for this job",
  salaries = "No salaries for this job",
  ratings = "No ratings for this job",
} = json;
return { applyOptions: apply_options, salaries, ratings };
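
To see how these defaults behave, here is a small offline sketch that applies the same destructuring to a made-up response object. The function name pickListingFields and the sample data are hypothetical. Note that a destructuring default is used only when the property is undefined, not when it is null:

```javascript
// Same destructuring as in getResults, extracted into a pure function.
const pickListingFields = (json) => {
  const {
    apply_options = "No apply options for this job",
    salaries = "No salaries for this job",
    ratings = "No ratings for this job",
  } = json;
  return { applyOptions: apply_options, salaries, ratings };
};

// Made-up response: `ratings` is missing and `salaries` is explicitly null.
const sample = {
  apply_options: [{ title: "Example", link: "https://example.com" }],
  salaries: null,
};

const picked = pickListingFields(sample);
console.log(picked.ratings); // No ratings for this job (property was undefined)
console.log(picked.salaries); // null (defaults do not replace null)
```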

After that, we run the getResults function and print all the received information in the console:

getResults().then(console.log);

Output

{
  "applyOptions": "No apply options for this job",
  "salaries": [
    {
      "job_title": "Barista",
      "link": "https://www.ziprecruiter.com/Salaries/Barista-Salary-in-San-Francisco,CA?utm_campaign=google_jobs_salary&utm_source=google_jobs_salary&utm_medium=organic",
      "source": "ZipRecruiter",
      "salary_from": 21000,
      "salary_to": 40000,
      "salary_currency": "$",
      "salary_periodicity": "year",
      "thumbnail": "https://serpapi.com/searches/6363e399a3f4ef79bb0ff373/images/f1d87c26c4968270a8e938f7cf10b5cdc1a6c6457fd1184e.png",
      "based_on": "Based on local employers"
    },
    {
      "job_title": "Coffee Barista",
      "link": "https://www.salary.com/research/salary/alternate/coffee-barista-salary/san-francisco-ca?utm_campaign=google_jobs_salary&utm_source=google_jobs_salary&utm_medium=organic",
      "source": "Salary.com",
      "salary_from": 27000,
      "salary_to": 36000,
      "salary_currency": "$",
      "salary_periodicity": "year",
      "thumbnail": "https://serpapi.com/searches/6363e399a3f4ef79bb0ff373/images/f1d87c26c496827075cfe00b6aeb887d3533b03d6f8540fc.png",
      "based_on": "Based on local employers"
    },
    {
      "job_title": "Barista",
      "link": "https://www.payscale.com/research/US/Job=Barista/Hourly_Rate/ae670208/San-Francisco-CA?utm_campaign=google_jobs_salary&utm_source=google_jobs_salary&utm_medium=organic",
      "source": "Payscale",
      "salary_from": 14,
      "salary_to": 20,
      "salary_currency": "$",
      "salary_periodicity": "hour",
      "thumbnail": "https://serpapi.com/searches/6363e399a3f4ef79bb0ff373/images/f1d87c26c4968270a20489154adfa8fce64f52ba237d9ed1.png",
      "based_on": "Based on local employers"
    }
  ],
  "ratings": [
    {
      "link": "https://www.glassdoor.com/Reviews/Blue-Bottle-Coffee-Reviews-E803867.htm?utm_campaign=google_jobs_reviews&utm_source=google_jobs_reviews&utm_medium=organic",
      "source": "Glassdoor",
      "rating": 3.6,
      "reviews": 303
    },
    {
      "link": "https://www.indeed.com/cmp/Blue-Bottle-Coffee-Company/reviews?utm_campaign=google_jobs_reviews&utm_source=google_jobs_reviews&utm_medium=organic",
      "source": "Indeed",
      "rating": 3.3,
      "reviews": 70
    },
    {
      "link": "https://www.comparably.com/companies/blue-bottle-coffee?utm_campaign=google_jobs_reviews&utm_source=google_jobs_reviews&utm_medium=organic",
      "source": "Comparably",
      "rating": 2.5,
      "reviews": 40
    }
  ]
}
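
As the output shows, a field can be either an array of objects or the fallback string (applyOptions here). Code that consumes the result therefore needs to handle both shapes. The sketch below is one possible way to turn salaries into readable lines. The function name formatSalaries is hypothetical, and the sample entries are copied from the output above with link and thumbnail omitted:

```javascript
// Two entries copied from the output above (link and thumbnail omitted).
const salaries = [
  { job_title: "Barista", source: "ZipRecruiter", salary_from: 21000, salary_to: 40000, salary_currency: "$", salary_periodicity: "year" },
  { job_title: "Barista", source: "Payscale", salary_from: 14, salary_to: 20, salary_currency: "$", salary_periodicity: "hour" },
];

// Accepts either the salaries array or the fallback string from getResults.
const formatSalaries = (value) => {
  if (!Array.isArray(value)) return [String(value)];
  return value.map(
    (s) =>
      `${s.source}: ${s.salary_currency}${s.salary_from}-${s.salary_currency}${s.salary_to} per ${s.salary_periodicity}`
  );
};

console.log(formatSalaries(salaries));
console.log(formatSalaries("No salaries for this job"));
```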

Links

If you want to see some projects made with SerpApi, write me a message.

Join us on Twitter | YouTube

Add a Feature Request💫 or a Bug🐞