
admin 2025/5/7 12:49:04 news


Python code example: scraping Ctrip flight tickets

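Ctrip's pages now load their data through an API, so the HTML cannot be parsed directly with XPath; the API calls have to be simulated instead.
dcity is the city code of the departure city.
acity is the city code of the destination city.
The other parameters were copied directly from the front-end page, as shown in Chrome on a Mac in the figure below.
[Figure: request parameters in Chrome]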

def getResponse(fromCity, toCity, date):
    cityInfo = judgeCity(fromCity, toCity, date)
    dcity = cityInfo.get('dcity')
    dcityName = cityInfo.get('dcityName')
    acity = cityInfo.get('acity')
    acityName = cityInfo.get('acityName')
    token = getToken(fromCity, toCity, date)
    url = "https://flights.ctrip.com/itinerary/api/12808/products"
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36",
        "Referer": "https://flights.ctrip.com/itinerary/oneway/" + dcity + "-" + acity + "?date=" + date,
        "Content-Type": "application/json",
    }
    request_payload = {
        "flightWay": "Oneway",
        "classType": "ALL",
        "hasChild": False,
        "hasBaby": False,
        "searchIndex": 1,
        "airportParams": [
            {"dcity": dcity, "acity": acity, "dcityname": dcityName, "acityname": acityName, "date": date}
        ],
        "token": token,
    }
    # POST request
    response = requests.post(url, data=json.dumps(request_payload), headers=headers).text
    return response
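Separating request construction from the network call makes the payload easy to inspect before anything is sent. The following is a minimal sketch that assumes the same endpoint and fields as the function above; `build_products_request` is a hypothetical helper name, not part of the original script.

```python
import json

PRODUCTS_URL = "https://flights.ctrip.com/itinerary/api/12808/products"


def build_products_request(dcity, acity, dcity_name, acity_name, date, token):
    """Return (url, headers, body) for the products query without sending it."""
    referer = f"https://flights.ctrip.com/itinerary/oneway/{dcity}-{acity}?date={date}"
    headers = {
        "Referer": referer,
        "Content-Type": "application/json",
    }
    payload = {
        "flightWay": "Oneway",
        "classType": "ALL",
        "hasChild": False,
        "hasBaby": False,
        "searchIndex": 1,
        "airportParams": [{
            "dcity": dcity, "acity": acity,
            "dcityname": dcity_name, "acityname": acity_name,
            "date": date,
        }],
        "token": token,
    }
    # The body is serialized here so it can be printed or logged as-is.
    return PRODUCTS_URL, headers, json.dumps(payload)
```

The returned values can then be passed to `requests.post(url, data=body, headers=headers)`, with a User-Agent header added as in the original function.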

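The individual fields of the returned result can be found in the figure below; they are then parsed out of the JSON.
[Figure: fields in the JSON response]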

def parseInfo(fromCity, toCity, date):
    airplaneInfo = {}
    response = getResponse(fromCity, toCity, date)
    print(response)
    # The response holds many flights; take the route list
    routeList = json.loads(response).get('data').get('routeList')
    # Read each route in turn
    for route in routeList:
        # Check that the route has data; without this check it sometimes raises an error
        if len(route.get('legs')) == 1:
            legs = route.get('legs')
            flight = legs[0].get('flight')
            # Extract the fields of interest
            airlineName = flight.get('airlineName')  # airline
            flightNumber = flight.get('flightNumber')  # flight number
            departureDate = flight.get('departureDate')  # departure time
            arrivalDate = flight.get('arrivalDate')  # arrival time
            craftTypeName = flight.get('craftTypeName')  # aircraft type
            craftTypeKindDisplayName = flight.get('craftTypeKindDisplayName')  # aircraft size class: large, medium, small
            departureCityName = flight.get('departureAirportInfo').get('cityName')  # departure city
            departureAirportName = flight.get('departureAirportInfo').get('airportName')  # departure airport
            departureTerminalName = flight.get('departureAirportInfo').get('terminal').get('name')  # departure terminal
            arrivalCityName = flight.get('arrivalAirportInfo').get('cityName')  # arrival city
            arrivalAirportName = flight.get('arrivalAirportInfo').get('airportName')  # arrival airport
            arrivalTerminalName = flight.get('arrivalAirportInfo').get('terminal').get('name')  # arrival terminal
            punctualityRate = flight.get('punctualityRate')  # on-time arrival rate
            mealType = flight.get('mealType')  # meal service: None = no meal, Snack = snack, Meal = meal included
            cabins = legs[0].get('cabins')
            price = cabins[0].get('price').get('price')  # standard price
            rate = cabins[0].get('price').get('rate')  # discount rate
            seatCount = cabins[0].get('seatCount')  # seats remaining
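The chained `.get(...).get(...)` calls above raise `AttributeError` as soon as one intermediate value is missing or `None`. One possible way around that is a small lookup helper that walks the nested structure and falls back to a default. This is only a sketch: `dig` is a hypothetical helper, and the sample record is invented for illustration rather than taken from a real response.

```python
def dig(obj, *keys, default=None):
    """Follow keys/indices through nested dicts and lists; return default on any miss."""
    for key in keys:
        if isinstance(obj, dict):
            obj = obj.get(key)
        elif isinstance(obj, list) and isinstance(key, int) and -len(obj) <= key < len(obj):
            obj = obj[key]
        else:
            return default
        if obj is None:
            return default
    return obj


# Hypothetical, heavily trimmed route record for illustration only.
route = {"legs": [{"flight": {"airlineName": "Example Air",
                              "departureAirportInfo": {"terminal": None}}}]}

print(dig(route, "legs", 0, "flight", "airlineName"))
print(dig(route, "legs", 0, "flight", "departureAirportInfo", "terminal", "name"))
```

The first call prints the airline name; the second prints `None` instead of failing, because the `terminal` value is missing.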

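The problem I ran into was the token. When entering a departure city and a destination, I noticed that the token changes, so I tried to obtain it myself and found an endpoint that returns the token, as shown below:
[Figure: the token endpoint]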

def getTokenResponse(fromCity, toCity, date):
    cityInfo = judgeCity(fromCity, toCity, date)
    dcity = cityInfo.get('dcity')
    acity = cityInfo.get('acity')
    referer = "https://flights.ctrip.com/itinerary/oneway/" + dcity + "-" + acity + "?date=" + date
    url = "https://flights.ctrip.com/itinerary/api/12808/records"
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36",
        "Referer": referer,
        "Content-Type": "application/json",
    }
    request_payload = {"vid": "1586849490183.3km93i"}
    print(referer)
    # POST request
    response = requests.post(url, data=json.dumps(request_payload), headers=headers).text
    return response


# Get the token
def getToken(fromCity, toCity, date):
    tokenResponse = getTokenResponse(fromCity, toCity, date)
    # data[0].data is a string of escaped JSON; strip the backslashes and cut out the token value
    dataInfo_str = json.loads(tokenResponse).get('data')[0].get('data')
    token = dataInfo_str.replace("\\", "").split("\"token\":")[1].split("\"")[1]
    return token
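The split-based extraction in `getToken` raises `IndexError` when the response contains no token. A regular expression gives the same result on the expected shape and returns `None` otherwise. This is a sketch under assumptions: `extract_token` is a hypothetical name, and the sample response shape is inferred only from the parsing code above, not from Ctrip documentation.

```python
import json
import re

TOKEN_RE = re.compile(r'"token"\s*:\s*"([^"]*)"')


def extract_token(records_response_text):
    """Return the first token found in the records response, or None."""
    data = json.loads(records_response_text).get("data") or []
    if not data:
        return None
    inner = data[0].get("data") or ""
    # Same idea as the original: drop the escaping backslashes, then find the token.
    match = TOKEN_RE.search(inner.replace("\\", ""))
    return match.group(1) if match else None


# Hypothetical response shape, inferred only from the parsing code above.
sample = json.dumps({"data": [{"data": '{\\"token\\":\\"abc123\\",\\"x\\":1}'}]})
print(extract_token(sample))
```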

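After obtaining the token, I entered a different departure city and destination and found that the scraped data did not change. I suspected the token, and it had indeed not changed. Checking in the front end, I found that this endpoint appears to cache its result three times: the token only changes after the query has been refreshed three times. Simulating three calls from Python still did not give me a new token, so suggestions from readers are welcome.
[Figure: token endpoint responses in the browser]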
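To make the observation above repeatable, the token call can be wrapped in a small loop that reports whether the value ever changes. This does not solve the caching problem, and it makes no assumption about why the server behaves this way; `poll_until_changed` is a hypothetical helper that only automates the check.

```python
import time


def poll_until_changed(fetch_token, attempts=3, delay=0.0):
    """Call fetch_token up to `attempts` times; return (token, changed).

    `changed` is True as soon as a call returns a token different from the first.
    """
    first = fetch_token()
    latest = first
    for _ in range(attempts - 1):
        time.sleep(delay)
        latest = fetch_token()
        if latest != first:
            return latest, True
    return latest, False
```

With the functions from this article it could be called as `poll_until_changed(lambda: getToken(fromCity, toCity, date), attempts=4, delay=1.0)`.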
Complete code

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2020/4/15 15:40
# @Author  : zhangliangliang
# @File    : airTicket_crawler_demo2.py
# @Software: PyCharm

import json
import random

import requests


# Read a file into a list of lines
def readFile(path):
    content_list = []
    with open(path, 'r', encoding='utf-8') as f:
        for content in f:
            content_list.append(content.rstrip())
    return content_list


# Append a line to a file
def writeFile(path, text):
    with open(path, 'a', encoding='utf-8') as f:
        f.write(text)
        f.write('\n')


# Empty a file
def truncateFile(path):
    with open(path, 'w', encoding='utf-8') as f:
        f.truncate()


# Parse the city-code file
def splitCity():
    content_list = readFile("/Users/zll/pycharmProjects/studyPython/crawler_poject_base_part1/config/cityInfos")
    cityResultList = []
    for cityInfo in content_list:
        if not cityInfo:
            continue
        # Each line looks like "BJS":"北京"; both parts keep their quotes
        cityInfoList = cityInfo.split(":")
        cityCode = cityInfoList[0].lower()
        cityName = cityInfoList[1]
        cityResult_str = '{"city":' + cityCode + ',"cityName":' + cityName + '}'
        cityResultList.append(json.loads(cityResult_str))
    return cityResultList


# Look up the codes for the departure and destination cities
def judgeCity(fromCity, toCity, date):
    result = {}
    for city in splitCity():
        fromResult = fromCity == str(city.get('cityName'))
        toResult = toCity == str(city.get('cityName'))
        if fromResult:
            result['dcity'] = city.get('city')
            result['dcityName'] = city.get('cityName')
        elif toResult:
            result['acity'] = city.get('city')
            result['acityName'] = city.get('cityName')
    return result


# Pick a random User-Agent
def getUserAgent():
    user_agent_list = readFile("/Users/zll/pycharmProjects/studyPython/crawler_poject_base_part1/config/user_agent.txt")
    userAgent = random.choice(user_agent_list)
    return userAgent


# Pick a random proxy IP
# def getIp():
#     ip_list = readFile('/Users/zll/pycharmProjects/studyPython/crawler_poject_base_part1/config/ip.txt')
#     ip = random.choice(ip_list)
#     return ip


def getTokenResponse(fromCity, toCity, date):
    cityInfo = judgeCity(fromCity, toCity, date)
    dcity = cityInfo.get('dcity')
    acity = cityInfo.get('acity')
    referer = "https://flights.ctrip.com/itinerary/oneway/" + dcity + "-" + acity + "?date=" + date
    url = "https://flights.ctrip.com/itinerary/api/12808/records"
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36",
        "Referer": referer,
        "Content-Type": "application/json",
    }
    request_payload = {"vid": "1586849490183.3km93i"}
    print(referer)
    # POST request
    response = requests.post(url, data=json.dumps(request_payload), headers=headers).text
    return response


# Get the token
def getToken(fromCity, toCity, date):
    tokenResponse = getTokenResponse(fromCity, toCity, date)
    dataInfo_str = json.loads(tokenResponse).get('data')[0].get('data')
    token = dataInfo_str.replace("\\", "").split("\"token\":")[1].split("\"")[1]
    return token


# Fetch the flight search result
def getResponse(fromCity, toCity, date):
    cityInfo = judgeCity(fromCity, toCity, date)
    dcity = cityInfo.get('dcity')
    dcityName = cityInfo.get('dcityName')
    acity = cityInfo.get('acity')
    acityName = cityInfo.get('acityName')
    token = getToken(fromCity, toCity, date)
    url = "https://flights.ctrip.com/itinerary/api/12808/products"
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36",
        "Referer": "https://flights.ctrip.com/itinerary/oneway/" + dcity + "-" + acity + "?date=" + date,
        "Content-Type": "application/json",
    }
    request_payload = {
        "flightWay": "Oneway",
        "classType": "ALL",
        "hasChild": False,
        "hasBaby": False,
        "searchIndex": 1,
        "airportParams": [
            {"dcity": dcity, "acity": acity, "dcityname": dcityName, "acityname": acityName, "date": date}
        ],
        "token": token,
    }
    # POST request
    response = requests.post(url, data=json.dumps(request_payload), headers=headers).text
    return response


# Parse the flight search result
def parseInfo(fromCity, toCity, date):
    response = getResponse(fromCity, toCity, date)
    print(response)
    # The response holds many flights; take the route list
    routeList = json.loads(response).get('data').get('routeList')
    # Read each route in turn
    for route in routeList:
        legs = route.get('legs')
        # Check that the route has data; without this check it sometimes raises an error
        if len(legs) != 1:
            continue
        flight = legs[0].get('flight')
        departure = flight.get('departureAirportInfo')
        arrival = flight.get('arrivalAirportInfo')
        cabin = legs[0].get('cabins')[0]
        refund = cabin.get('refundEndorse')
        luggage = cabin.get('luggageLimitation')
        characteristic = legs[0].get('characteristic')
        # Collect the fields of interest in a dictionary
        airplaneInfo = {
            "airlineName": flight.get('airlineName'),  # airline
            "flightNumber": flight.get('flightNumber'),  # flight number
            "departureDate": flight.get('departureDate'),  # departure time
            "arrivalDate": flight.get('arrivalDate'),  # arrival time
            "craftTypeName": flight.get('craftTypeName'),  # aircraft type
            "craftTypeKindDisplayName": flight.get('craftTypeKindDisplayName'),  # aircraft size class: large, medium, small
            "departureCityName": departure.get('cityName'),  # departure city
            "departureAirportName": departure.get('airportName'),  # departure airport
            "departureTerminalName": departure.get('terminal').get('name'),  # departure terminal
            "arrivalCityName": arrival.get('cityName'),  # arrival city
            "arrivalAirportName": arrival.get('airportName'),  # arrival airport
            "arrivalTerminalName": arrival.get('terminal').get('name'),  # arrival terminal
            "punctualityRate": flight.get('punctualityRate'),  # on-time arrival rate
            "mealType": flight.get('mealType'),  # None = no meal, Snack = snack, Meal = meal included
            "price": cabin.get('price').get('price'),  # standard price
            "rate": cabin.get('price').get('rate'),  # discount rate
            "seatCount": cabin.get('seatCount'),  # seats remaining
            "refundEndorse": refund.get('minRefundFee'),  # adult ticket: refund fee
            "minEndorseFee": refund.get('minEndorseFee'),  # adult ticket: change fee (field name assumed; not verified against the API)
            "endorseNote": refund.get('endorseNote'),  # adult ticket: endorsement conditions
            "freeLuggageAmount": cabin.get('freeLuggageAmount'),  # free checked baggage weight
            # Piece counts: 0 = no free allowance, 1 = one piece, -2 = unlimited
            "carryonLuggageMaxAmount": luggage.get('carryonLuggageMaxAmount'),  # carry-on: max pieces
            "carryonLuggageMaxWeight": luggage.get('carryonLuggageMaxWeight'),  # carry-on: max weight
            "carryonLuggageMaxSize": luggage.get('carryonLuggageMaxSize'),  # carry-on: max size
            "checkinLuggageMaxAmount": luggage.get('checkinLuggageMaxAmount'),  # checked: max pieces
            "checkinLuggageMaxWeight": luggage.get('checkinLuggageMaxWeight'),  # checked: max weight
            "checkinLuggageMaxSize": luggage.get('checkinLuggageMaxSize'),  # checked: max size
            "lowestPrice": characteristic.get('lowestPrice'),  # lowest adult economy fare
            "lowestCfPrice": characteristic.get('lowestCfPrice'),  # lowest adult business fare
            "lowestChildPrice": characteristic.get('lowestChildPrice'),  # lowest child economy fare
            "lowestChildCfPrice": characteristic.get('lowestChildCfPrice'),  # lowest child business fare
        }
        # One tab-separated row per flight, followed by the dictionary itself
        print(*airplaneInfo.values(), sep=" \t ")
        print(airplaneInfo)


if __name__ == "__main__":
    fromCity = input("Departure city: ")
    toCity = input("Destination city: ")
    date = input("Departure date: ")
    # parseInfo(fromCity, toCity, date)
    a = getResponse(fromCity, toCity, date)
    print(a)
    t = getToken(fromCity, toCity, date)
    print(t)

The city codes are as follows

"AAT":"阿勒泰"
"ACX":"兴义"
"AEB":"百色"
"AKU":"阿克苏"
"AOG":"鞍山"
"AQG":"安庆"
"AVA":"安顺"
"AXF":"阿拉善左旗"
"BAV":"包头"
"BFJ":"毕节"
"BHY":"北海"
"BJS":"北京"
"BPE":"秦皇岛"
"BPL":"博乐"
"BPX":"昌都"
"BSD":"保山"
"CAN":"广州"
"CDE":"承德"
"CGD":"常德"
"CGO":"郑州"
"CGQ":"长春"
"CHG":"朝阳"
"CIF":"赤峰"
"CIH":"长治"
"CKG":"重庆"
"CSX":"长沙"
"CTU":"成都"
"CWJ":"沧源"
"CYI":"嘉义"
"CZX":"常州"
"DAT":"大同"
"DAX":"达县"
"DBC":"白城"
"DCY":"稻城"
"DDG":"丹东"
"DIG":"香格里拉(迪庆)"
"DLC":"大连"
"DLU":"大理"
"DNH":"敦煌"
"DOY":"东营"
"DQA":"大庆"
"DSN":"鄂尔多斯"
"DYG":"张家界"
"EJN":"额济纳旗"
"ENH":"恩施"
"ENY":"延安"
"ERL":"二连浩特"
"FOC":"福州"
"FUG":"阜阳"
"FUO":"佛山"
"FYJ":"抚远"
"GOQ":"格尔木"
"GYS":"广元"
"GYU":"固原"
"HAK":"海口"
"HDG":"邯郸"
"HEK":"黑河"
"HET":"呼和浩特"
"HFE":"合肥"
"HGH":"杭州"
"HIA":"淮安"
"HJJ":"怀化"
"HKG":"香港"
"HLD":"海拉尔"
"HLH":"乌兰浩特"
"HMI":"哈密"
"HPG":"神农架"
"HRB":"哈尔滨"
"HSN":"舟山"
"HTN":"和田"
"HUZ":"惠州"
"HYN":"台州"
"HZG":"汉中"
"HZH":"黎平"
"INC":"银川"
"IQM":"且末"
"IQN":"庆阳"
"JDZ":"景德镇"
"JGD":"加格达奇"
"JGN":"嘉峪关"
"JGS":"井冈山"
"JHG":"西双版纳"
"JIC":"金昌"
"JIQ":"黔江"
"JIU":"九江"
"JJN":"晋江"
"JMJ":"澜沧"
"JMU":"佳木斯"
"JNG":"济宁"
"JNZ":"锦州"
"JSJ":"建三江"
"JUH":"池州"
"JUZ":"衢州"
"JXA":"鸡西"
"JZH":"九寨沟"
"KCA":"库车"
"KGT":"康定"
"KHG":"喀什"
"KHN":"南昌"
"KJH":"凯里"
"KMG":"昆明"
"KNH":"金门"
"KOW":"赣州"
"KRL":"库尔勒"
"KRY":"克拉玛依"
"KWE":"贵阳"
"KWL":"桂林"
"LCX":"龙岩"
"LDS":"伊春"
"LFQ":"临汾"
"LHW":"兰州"
"LJG":"丽江"
"LLB":"荔波"
"LLF":"永州"
"LLV":"吕梁"
"LNJ":"临沧"
"LPF":"六盘水"
"LUM":"芒市"
"LXA":"拉萨"
"LYA":"洛阳"
"LYG":"连云港"
"LYI":"临沂"
"LZH":"柳州"
"LZO":"泸州"
"LZY":"林芝"
"MDG":"牡丹江"
"MFK":"马祖"
"MFM":"澳门"
"MIG":"绵阳"
"MXZ":"梅州"
"NAO":"南充"
"NBS":"白山"
"NDG":"齐齐哈尔"
"NGB":"宁波"
"NGQ":"阿里"
"NKG":"南京"
"NLH":"宁蒗"
"NNG":"南宁"
"NNY":"南阳"
"NTG":"南通"
"NZH":"满洲里"
"OHE":"漠河"
"PZI":"攀枝花"
"RHT":"阿拉善右旗"
"RIZ":"日照"
"RKZ":"日喀则"
"RLK":"巴彦淖尔"
"SHA":"上海"
"SHE":"沈阳"
"SIA":"西安"
"SJW":"石家庄"
"SWA":"揭阳"
"SYM":"普洱"
"SYX":"三亚"
"SZX":"深圳"
"TAO":"青岛"
"TCG":"塔城"
"TCZ":"腾冲"
"TEN":"铜仁"
"TGO":"通辽"
"THQ":"天水"
"TLQ":"吐鲁番"
"TNA":"济南"
"TSN":"天津"
"TVS":"唐山"
"TXN":"黄山"
"TYN":"太原"
"URC":"乌鲁木齐"
"UYN":"榆林"
"WEF":"潍坊"
"WEH":"威海"
"WMT":"遵义(茅台)"
"WNH":"文山"
"WNZ":"温州"
"WUA":"乌海"
"WUH":"武汉"
"WUS":"武夷山"
"WUX":"无锡"
"WUZ":"梧州"
"WXN":"万州"
"XFN":"襄阳"
"XIC":"西昌"
"XIL":"锡林浩特"
"XMN":"厦门"
"XNN":"西宁"
"XUZ":"徐州"
"YBP":"宜宾"
"YCU":"运城"
"YIC":"宜春"
"YIE":"阿尔山"
"YIH":"宜昌"
"YIN":"伊宁"
"YIW":"义乌"
"YNJ":"延吉"
"YNT":"烟台"
"YNZ":"盐城"
"YTY":"扬州"
"YUS":"玉树"
"YZY":"张掖"
"ZAT":"昭通"
"ZHA":"湛江"
"ZHY":"中卫"
"ZQZ":"张家口"
"ZUH":"珠海"
"ZYI":"遵义(新舟)"
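Lines in the format above can also be loaded into a name-to-code dictionary directly, which avoids rebuilding a JSON string for every line and scanning a list on each lookup. A minimal sketch, assuming one `"CODE":"name"` pair per line; `parse_city_codes` is a hypothetical helper, and codes are lowercased to match what the script sends to the API.

```python
import re

LINE_RE = re.compile(r'^\s*"([A-Za-z]{3})"\s*:\s*"(.+)"\s*$')


def parse_city_codes(lines):
    """Map city name -> lowercase three-letter code, skipping malformed lines."""
    codes = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            code, name = match.groups()
            codes[name] = code.lower()
    return codes


cities = parse_city_codes(['"BJS":"北京"', '"SHA":"上海"', '', 'not a city line'])
print(cities.get("北京"), cities.get("上海"))
```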
