需求:在使用 latex 写论文的时候,你是否有这个需求,需要将引用转换为 bibtex 格式,如果文献量很大,这个重复工作实在不值得做,如果你实现使用了文献管理工具,例如 endnote、zotero,可以一件导出,但是没有的话,本文提供一个解决方案
方案:crossref API+google scholar API
crossref 是最大的外文 doi 发布平台,基本包含了所有的外文文献的元数据,但是也有一些包括不限于 arXiv 等文献是查询不到了,这个时候需要 google scholar 帮忙
为了节省大家的时间,这两个 api 我已经进行了封装,只需要使用 pip 下载下来
pip install get_bibtex
之后可以按照下面的使用方法
from apiModels.get_bibtex_from_crossref import GetBibTex
from apiModels.get_bibtex_from_google_scholar import GetBibTexFromGoogleScholar
if __name__ == '__main__':
google_scholar_api_key = "your_google_scholar_api_key"
get_bibtex_from_crossref = GetBibTex("[email protected]")
get_bibtex_from_google_scholar = GetBibTexFromGoogleScholar(google_scholar_api_key, GetBibTexFromGoogleScholar.APA)
with open("inputfile/Bibliographyraw.txt", "r", encoding='utf-8') as f:
raws = f.readlines()
# get bibtex from CrossRef and failed search results
success_bibtexs_crossref, failed_results = get_bibtex_from_crossref.get_bibtexs(raws)
# for each failed search result, get bibtex from Google Scholar
success_bibtexs_google, failed_results = get_bibtex_from_google_scholar.get_bibtexs(failed_results)
with open("outputfile/BibliographyCrossRef.txt", "w", encoding='utf-8') as f:
for bibtex in success_bibtexs_crossref:
f.write(bibtex)
with open("outputfile/BibliographyGoogleScholar.txt", "w", encoding='utf-8') as f:
for index, bibtex in enumerate(success_bibtexs_google):
f.write("[]".format(index) + " " + bibtex + "\n")
with open("outputfile/not_find.txt", "w", encoding='utf-8') as f:
for result in failed_results:
f.write(result+"\n")
print("find bibtex from CrossRef: ", len(success_bibtexs_crossref))
print("find bibtex from Google Scholar: ", len(success_bibtexs_google))
print("not find: ", len(failed_results))
关键代码解释
Bibliographyraw.txt里面是需要查询的文件
例如:
J. Hu, L. Shen, S. Albanie, G. Sun, and A. Vedaldi, “Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks.” arXiv, Jan. 12, 2019. doi: 10.48550/arXiv.1810.12348.
X. Wang, R. Girshick, A. Gupta, and K. He, “Non-local Neural Networks.” arXiv, Apr. 13, 2018. doi: 10.48550/arXiv.1711.07971.
------------------
success_bibtexs_crossref, failed_results = get_bibtex_from_crossref.get_bibtexs(raws)
返回的第一个参数为bibtex列表,第二个为没有查询到的原文献
success_bibtexs_google, failed_results = get_bibtex_from_google_scholar.get_bibtexs(failed_results)
将没有查到的文献继续使用google API查询,一般都是可以查询到了,没有文献在google scholar查询不到了吧
提示,这里返回的其实是APA格式的,在上面初始化指定,也有一个参数可以设置返回bibtex格式,例如
get_bibtex_from_google_scholar = GetBibTexFromGoogleScholar(google_scholar_api_key, GetBibTexFromGoogleScholar.APA, flag = True)
但是需要设置代理服务器,例如:
import os
import re
import requests
os.environ["http_proxy"]="127.0.0.1:7890"
os.environ["https_proxy"]="127.0.0.1:7890"
!!!!!!!注意:需要先去申请 API,每个月有 100 的免费查询次数,一般是够用的,在 serpapi.com 申请
之后的代码都一目了然了哈哈
当然也有请求单个查询的:
get_bibtex() 去掉s就可以了
2024.4.16 更新#
1、添加了 DBLP 接口
from apiModels.get_bibtex_from_dblp import GetBibTexFromDBLP
2、提高使用便捷性
现在提供一个封装好的类提供使用,这个方法已经封装好了Crosref和DBLP的API
from apiModels.workflow.crossref2dblp import Crossref2Dblp
使用方法(无google scholar API)
crossref2dblp = Crossref2Dblp("your email", "inputfile/Bibliographyraw.txt", "outputfile/Bibliography.txt")
crossref2dblp.running()
坐等运行完成
(有了google scholar API)
from apiModels.workflow.crossref2dblp import Crossref2Dblp
from apiModels.get_bibtex_from_google_scholar import GetBibTexFromGoogleScholar
get_bibtex_from_google_scholar = GetBibTexFromGoogleScholar(api_key="your api key")
在最后面参数加上你封装的API
crossref2dblp = Crossref2Dblp("[email protected]", "inputfile/Bibliographyraw.txt", "outputfile/Bibliography.txt",get_bibtex_from_google_scholar)
crossref2dblp.running()
坐等运行完成
或者你想自己定义api之间的调用顺序
from apiModels.workflow.make_workflow import MakeWorkflow
from apiModels.get_bibtex_from_google_scholar import GetBibTexFromGoogleScholar
from apiModels.get_bibtex_from_crossref import GetBibTex
get_bibtex_from_google_scholar = GetBibTexFromGoogleScholar(api_key="your api key")
get_bibtex_from_crossref = GetBibTex("[email protected]")
make_workflow = MakeWorkflow("inputfile/Bibliographyraw.txt", "outputfile/Bibliography.txt", get_bibtex_from_google_scholar, get_bibtex_from_crossref)
make_workflow.running()
使用之前:
pip install get_bibtex = 1.1.0
欢迎改进
此文由 Mix Space 同步更新至 xLog
原始链接为 https://me.liuyaowen.club/posts/default/20240816and2