Scraping LinkedIn Data with Proxycurl, Python, and Node.js

Today, I want to show you how to scrape data from LinkedIn using the Proxycurl API with Python and Node.js.

Let's start by scraping data with Python and the requests library.

I will use the Employee Count Endpoint of the Proxycurl Company API.

Install the requests package:

!pip install requests

Next, get your Proxycurl API key: create an account with Proxycurl and generate your API key.
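If you would rather not paste the key into your script, one option is to read it from an environment variable. Here is a minimal sketch of that idea. The variable name `PROXYCURL_API_KEY` and the helper `build_auth_header` are my own assumptions for illustration, not something Proxycurl requires; the header format is the same Bearer header used in the rest of this post.

```python
import os


def build_auth_header(env_var="PROXYCURL_API_KEY"):
    """Build the Bearer header from an environment variable.

    The variable name is an assumption for this sketch; any name works.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a readable message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"Authorization": "Bearer " + api_key}
```

You could then pass `headers=build_auth_header()` to `requests.get` instead of building the dictionary by hand.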

Let's count the number of employees working at Apple Inc.

Using the requests library:

import requests

api_endpoint ='https://nubela.co/proxycurl/api/linkedin/company/employees/count/'

api_key = 'YOUR_API_KEY_HERE'

header_dic = {'Authorization': 'Bearer ' + api_key}

params = {
    'linkedin_employee_count': 'include',
    'employment_status': 'current',
    'url': 'https://www.linkedin.com/company/apple/',
}

response = requests.get(api_endpoint,
                        params=params,
                        headers=header_dic)

# Print the parsed JSON body of the response
print(response.json())

The output is:
{ 'total_employee': 94262, 'linkedin_employee_count': 567686, 'linkdb_employee_count': 94262 }
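The two counts in that response differ quite a lot, so here is a small sketch for comparing them. I am assuming, from the field names, that `linkedin_employee_count` is the figure reported on the company's LinkedIn page and `linkdb_employee_count` is the number of profiles Proxycurl has in its own dataset. The helper name `summarize_counts` is made up for this example.

```python
def summarize_counts(counts):
    """Compare the LinkedIn-reported count with the LinkDB count."""
    linkedin = counts.get("linkedin_employee_count")
    linkdb = counts.get("linkdb_employee_count")
    if not linkedin or linkdb is None:
        # One of the counts is missing or zero, so there is nothing to compare.
        return {"gap": None, "coverage": None}
    return {
        "gap": linkedin - linkdb,
        "coverage": round(linkdb / linkedin, 3),
    }


# The Apple response shown above
apple = {
    "total_employee": 94262,
    "linkedin_employee_count": 567686,
    "linkdb_employee_count": 94262,
}
print(summarize_counts(apple))  # {'gap': 473424, 'coverage': 0.166}
```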

Let's try counting the number of employees working at Twitter:

import requests

api_endpoint = 'https://nubela.co/proxycurl/api/linkedin/company/employees/count/'
api_key = 'YOUR_API_KEY_HERE'
header_dic = {'Authorization': 'Bearer ' + api_key}
params = {
    'linkedin_employee_count': 'include',
    'employment_status': 'current',
    'url': 'https://www.linkedin.com/company/twitter/',
}
response = requests.get(api_endpoint,
                        params=params,
                        headers=header_dic)

# Print the parsed JSON body of the response
print(response.json())

The output is:

{'total_employee': 7472, 'linkedin_employee_count': 7992, 'linkdb_employee_count': 7472 }

You can try this with as many companies as you like.
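To avoid copying the script once per company, the request can be wrapped in a loop. The sketch below is one way to do that, not the only one. The function name `count_employees` and the `http_get` argument are my own additions: `http_get` is any callable with the same signature as `requests.get`, which keeps the example easy to try without network access. The endpoint and parameters are the ones used above.

```python
API_ENDPOINT = "https://nubela.co/proxycurl/api/linkedin/company/employees/count/"


def count_employees(company_urls, api_key, http_get):
    """Query the Employee Count Endpoint once per company URL.

    http_get is assumed to behave like requests.get: it accepts
    params= and headers= and returns an object with .status_code and .json().
    """
    headers = {"Authorization": "Bearer " + api_key}
    results = {}
    for url in company_urls:
        params = {
            "linkedin_employee_count": "include",
            "employment_status": "current",
            "url": url,
        }
        response = http_get(API_ENDPOINT, params=params, headers=headers)
        if response.status_code == 200:
            results[url] = response.json()
        else:
            # Record the failure instead of aborting the whole batch.
            results[url] = {"error": response.status_code}
    return results
```

With requests installed, a call would look like `count_employees(['https://www.linkedin.com/company/apple/', 'https://www.linkedin.com/company/twitter/'], api_key, requests.get)`.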

Next, let's try scraping data from LinkedIn using Proxycurl and Node.js.

cd c:\\User\user\Folder name

Install the packages:

npm install express axios dotenv

Or, with Yarn:

yarn add express axios dotenv

Generate an API key from Proxycurl and store it in a .env file, which dotenv loads into process.env:

API_KEY = 'YOUR_API_KEY_HERE'

Code snippet:

import express from 'express';
import axios from 'axios';
import dotenv from 'dotenv';

const app = express();

dotenv.config();

app.listen(8000, () => {
    console.log('App connected successfully!');
});

// Getting Company's job listing

const TWITTER_URL = 'https://www.linkedin.com/company/twitter/';

const COMPANY_PROFILE_ENDPOINT = 'https://nubela.co/proxycurl/api/linkedin/company';

const JOBS_LISTING_ENDPOINT = 'https://nubela.co/proxycurl/api/v2/linkedin/company/job';

const JOB_PROFILE_ENDPOINT = 'https://nubela.co/proxycurl/api/linkedin/job';

// Request configuration for the company profile lookup
const companyProfileConfig = {
    url: COMPANY_PROFILE_ENDPOINT,
    method: 'get',
    headers: {'Authorization': 'Bearer ' + process.env.API_KEY},
    params: {
        url: TWITTER_URL
    }
};

const getTwitterProfile = async () => {
    return await axios(companyProfileConfig);
};

const profile = await getTwitterProfile();

const twitterID = profile.data.search_id;

console.log('Twitter ID:', twitterID);


// Request configuration for the job listings lookup
const jobListingsConfig = {
    url: JOBS_LISTING_ENDPOINT,
    method: 'get',
    headers: {'Authorization': 'Bearer ' + process.env.API_KEY},
    params: {
        // The job search is keyed by the search_id from the company profile
        search_id: twitterID
    }
};

const getTwitterListings = async () => {
    return await axios(jobListingsConfig);
};

const jobListings = await getTwitterListings();

const jobs = jobListings.data.job;

console.log(jobs);

// Specific Job listing code snippet

const jobProfileConfig = {
    url: JOB_PROFILE_ENDPOINT,
    method: 'get',
    headers: { 'Authorization': 'Bearer ' + process.env.API_KEY },
    params: {
        url: jobs[0].job_url
    }
};

const getJobDetails = async () => {
    return await axios(jobProfileConfig);
};

const jobDetails = await getJobDetails(); 

console.log(jobDetails.data);
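For readers following along in Python, the same three-step flow (company profile, then job listings, then one job's details) can be sketched as below. This is only a rough port under a few assumptions: the field names `search_id`, `job`, and `job_url` are taken from the Node.js code above, the function name `first_job_details` is made up, and `http_get` stands in for any callable shaped like `requests.get`. I also added a guard for companies with no open listings, which the Node.js snippet does not have.

```python
COMPANY_PROFILE_ENDPOINT = "https://nubela.co/proxycurl/api/linkedin/company"
JOBS_LISTING_ENDPOINT = "https://nubela.co/proxycurl/api/v2/linkedin/company/job"
JOB_PROFILE_ENDPOINT = "https://nubela.co/proxycurl/api/linkedin/job"


def first_job_details(company_url, api_key, http_get):
    """Return details of the company's first listed job, or None if there are none."""
    headers = {"Authorization": "Bearer " + api_key}

    # Step 1: the company profile carries the search_id used by the job search.
    profile = http_get(
        COMPANY_PROFILE_ENDPOINT, params={"url": company_url}, headers=headers
    ).json()
    search_id = profile["search_id"]

    # Step 2: list the company's job postings.
    listings = http_get(
        JOBS_LISTING_ENDPOINT, params={"search_id": search_id}, headers=headers
    ).json()
    jobs = listings.get("job") or []
    if not jobs:
        return None

    # Step 3: fetch the full details of the first posting.
    return http_get(
        JOB_PROFILE_ENDPOINT, params={"url": jobs[0]["job_url"]}, headers=headers
    ).json()
```

With requests installed, `first_job_details('https://www.linkedin.com/company/twitter/', api_key, requests.get)` would make the three calls in order.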

Your package.json should look like this:

{
  "name": "nubela",
  "version": "1.0.0",
  "type": "module",
  "description": "",
  "main": "proxycurl.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "axios": "^1.1.3",
    "dotenv": "^16.0.3",
    "express": "^4.18.2"
  }
}

You can use the Proxycurl API to scrape any data you choose from LinkedIn.
