Introduction#
In academic research, managing references is important but time-consuming work. When writing a paper in particular, we often need to pull citations in the right format from several different databases. To solve this problem, I developed get-bibtex, a Python library that helps researchers quickly fetch BibTeX citations from multiple academic databases.
Why get-bibtex?#
1. Multiple Sources#
- CrossRef (the most comprehensive DOI database)
- DBLP (the computer science bibliography)
- Google Scholar (requires an API key)
2. Smart Workflows#
from get_bibtex import WorkflowBuilder, CrossRefBibTeX, DBLPBibTeX

# Create the workflow
workflow = WorkflowBuilder()
workflow.add_fetcher(CrossRefBibTeX("[email protected]"))
workflow.add_fetcher(DBLPBibTeX())

# Batch processing
papers = [
    "10.1145/3292500.3330919",    # by DOI
    "Attention is all you need"   # by title
]
results = workflow.get_multiple_bibtex(papers)
3. Simple to Use#
from get_bibtex import CrossRefBibTeX

# Fetch a single citation
fetcher = CrossRefBibTeX(email="[email protected]")
bibtex = fetcher.get_bibtex("10.1145/3292500.3330919")
print(bibtex)
4. File Batch Processing#
# Read queries from a file and save the results
workflow.process_file(
    input_path="papers.txt",
    output_path="references.bib"
)
Key Features#
- Smart fallback mechanism (a short sketch of the idea follows this list)
  - When one data source fails, the next one is tried automatically
  - Maximizes the chance of retrieving every citation
- Progress tracking
  - Uses tqdm to display processing progress
  - Gives a clear view of batch-processing status
- Error handling
  - Detailed logging
  - Graceful handling of API rate limits and network errors
- Formatted output
  - Cleans and formats BibTeX automatically
  - Ensures consistent output formatting
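Conceptually, the fallback works by trying each configured source in turn and stopping at the first usable result. The snippet below is only a minimal, hypothetical sketch of that idea; the helper fetch_with_fallback is not part of the library (the real logic lives inside WorkflowBuilder):

from get_bibtex import CrossRefBibTeX, DBLPBibTeX

def fetch_with_fallback(query, fetchers):
    # Hypothetical helper: try each source in order and return the first non-empty BibTeX entry.
    for fetcher in fetchers:
        try:
            bibtex = fetcher.get_bibtex(query)
            if bibtex:
                return bibtex
        except Exception as exc:
            # A network error or rate limit just means we move on to the next source.
            print(f"{type(fetcher).__name__} failed: {exc}")
    return None

entry = fetch_with_fallback(
    "Attention Is All You Need",
    [CrossRefBibTeX(email="[email protected]"), DBLPBibTeX()],
)
print(entry or "not found")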
Use Cases#
Paper Writing#
While writing a paper, you can fetch citations directly by DOI or by title:
from get_bibtex import CrossRefBibTeX

fetcher = CrossRefBibTeX()
citations = [
    "Machine learning",
    "Deep learning",
    "10.1038/nature14539"
]
for citation in citations:
    bibtex = fetcher.get_bibtex(citation)
    print(bibtex)
Literature Reviews#
Process a large number of citations in batch:
from get_bibtex import WorkflowBuilder, CrossRefBibTeX, DBLPBibTeX

workflow = WorkflowBuilder()
workflow.add_fetcher(CrossRefBibTeX())
workflow.add_fetcher(DBLPBibTeX())

# Read the list of papers from a file
workflow.process_file("papers.txt", "bibliography.bib")
Fetching Citations for Attention-Related Papers#
Suppose we need citations for the following papers:
- FedMSA: a model selection and adaptation system for federated learning
- Attention Is All You Need: the seminal Transformer paper
- Non-Local Neural Networks
- ECA-Net: efficient channel attention
- CBAM: Convolutional Block Attention Module
Fetching with CrossRef (by DOI)#
from apiModels import CrossRefBibTeX
fetcher = CrossRefBibTeX(email="[email protected]")
# FedMSA
bibtex = fetcher.get_bibtex("10.3390/s22197244")
print(bibtex)
# ECA-Net
bibtex = fetcher.get_bibtex("10.1109/cvpr42600.2020.01155")
print(bibtex)
Example output:
@article{Sun_2022,
title={FedMSA: A Model Selection and Adaptation System for Federated Learning},
volume={22},
ISSN={1424-8220},
url={http://dx.doi.org/10.3390/s22197244},
DOI={10.3390/s22197244},
number={19},
journal={Sensors},
publisher={MDPI AG},
author={Sun, Rui and Li, Yinhao and Shah, Tejal and Sham, Ringo W. H. and Szydlo, Tomasz and Qian, Bin and Thakker, Dhaval and Ranjan, Rajiv},
year={2022},
month=sep,
pages={7244}
}
Fetching with DBLP (by title)#
from apiModels import DBLPBibTeX
fetcher = DBLPBibTeX()
# CBAM
bibtex = fetcher.get_bibtex("CBAM: Convolutional Block Attention Module")
print(bibtex)
Example output:
@article{DBLP:journals/access/WangZHLL24,
author = {Niannian Wang and Zexi Zhang and Haobang Hu and Bin Li and Jianwei Lei},
title = {Underground Defects Detection Based on {GPR} by Fusing Simple Linear Iterative Clustering Phash (SLIC-Phash) and Convolutional Block Attention Module (CBAM)-YOLOv8},
journal = {{IEEE} Access},
volume = {12},
pages = {25888--25905},
year = {2024},
url = {https://doi.org/10.1109/ACCESS.2024.3365959},
doi = {10.1109/ACCESS.2024.3365959}
}
Fetching Multiple Citations with a Workflow#
from apiModels import WorkflowBuilder, CrossRefBibTeX, DBLPBibTeX

# Create the workflow
workflow = WorkflowBuilder()
workflow.add_fetcher(CrossRefBibTeX(email="[email protected]"))
workflow.add_fetcher(DBLPBibTeX())

# Prepare the list of queries
queries = [
    "FedMSA: A Model Selection and Adaptation System for Federated Learning",
    "Attention Is All You Need",
    "Non-Local Neural Networks",
    "ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks",
    "CBAM: Convolutional Block Attention Module"
]

# Fetch all citations
results = workflow.get_multiple_bibtex(queries)

# Print the results
for query, bibtex in results.items():
    print(f"\nQuery: {query}")
    print(f"Citation:\n{bibtex if bibtex else 'not found'}")
File Batch Processing#
You can create a workflow to process a file containing multiple citations. First, create the workflow and add the data sources:
from apiModels import WorkflowBuilder, CrossRefBibTeX, DBLPBibTeX

workflow = WorkflowBuilder()
workflow.add_fetcher(CrossRefBibTeX(email="[email protected]"))
workflow.add_fetcher(DBLPBibTeX())

# Process the file
workflow.process_file("papers.txt", "references.bib")
Example input:
papers.txt
FedMSA: A Model Selection and Adaptation System for Federated Learning
Attention Is All You Need
Non-Local Neural Networks
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
CBAM: Convolutional Block Attention Module
Example output:
references.bib
% Query: FedMSA: A Model Selection and Adaptation System for Federated Learning
% Source: CrossRefBibTeX
@article{Sun_2022, title={FedMSA: A Model Selection and Adaptation System for Federated Learning}, volume={22}, ISSN={1424-8220}, url={http://dx.doi.org/10.3390/s22197244}, DOI={10.3390/s22197244}, number={19}, journal={Sensors}, publisher={MDPI AG}, author={Sun, Rui and Li, Yinhao and Shah, Tejal and Sham, Ringo W. H. and Szydlo, Tomasz and Qian, Bin and Thakker, Dhaval and Ranjan, Rajiv}, year={2022}, month=sep, pages={7244} }
% Query: Attention Is All You Need
% Source: DBLPBibTeX
@inproceedings{DBLP:conf/dac/ZhangYY21,
author = {Xiaopeng Zhang and
Haoyu Yang and
Evangeline F. Y. Young},
title = {Attentional Transfer is All You Need: Technology-aware Layout Pattern
Generation},
booktitle = {58th {ACM/IEEE} Design Automation Conference, {DAC} 2021, San Francisco,
CA, USA, December 5-9, 2021},
pages = {169--174},
publisher = {{IEEE}},
year = {2021},
url = {https://doi.org/10.1109/DAC18074.2021.9586227},
doi = {10.1109/DAC18074.2021.9586227},
timestamp = {Wed, 03 May 2023 17:06:11 +0200},
biburl = {https://dblp.org/rec/conf/dac/ZhangYY21.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
% Query: Non-Local Neural Networks
% Source: CrossRefBibTeX
@article{Xu_2024, title={Adaptive selection of local and non-local attention mechanisms for speech enhancement}, volume={174}, ISSN={0893-6080}, url={http://dx.doi.org/10.1016/j.neunet.2024.106236}, DOI={10.1016/j.neunet.2024.106236}, journal={Neural Networks}, publisher={Elsevier BV}, author={Xu, Xinmeng and Tu, Weiping and Yang, Yuhong}, year={2024}, month=jun, pages={106236} }
% Query: ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
% Source: CrossRefBibTeX
@inproceedings{Wang_2020, title={ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks}, url={http://dx.doi.org/10.1109/cvpr42600.2020.01155}, DOI={10.1109/cvpr42600.2020.01155}, booktitle={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, publisher={IEEE}, author={Wang, Qilong and Wu, Banggu and Zhu, Pengfei and Li, Peihua and Zuo, Wangmeng and Hu, Qinghua}, year={2020}, month=jun, pages={11531–11539} }
% Query: CBAM: Convolutional Block Attention Module
% Source: DBLPBibTeX
@article{DBLP:journals/access/WangZHLL24,
author = {Niannian Wang and
Zexi Zhang and
Haobang Hu and
Bin Li and
Jianwei Lei},
title = {Underground Defects Detection Based on {GPR} by Fusing Simple Linear
Iterative Clustering Phash (SLIC-Phash) and Convolutional Block Attention
Module (CBAM)-YOLOv8},
journal = {{IEEE} Access},
volume = {12},
pages = {25888--25905},
year = {2024},
url = {https://doi.org/10.1109/ACCESS.2024.3365959},
doi = {10.1109/ACCESS.2024.3365959},
timestamp = {Sat, 16 Mar 2024 15:09:59 +0100},
biburl = {https://dblp.org/rec/journals/access/WangZHLL24.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Installation#
With pip:
pip install get-bibtex
With Poetry:
poetry add get-bibtex
Best Practices#
- Register with your email
  fetcher = CrossRefBibTeX(email="[email protected]")
  Supplying an email gives you better API access priority.
- Order your workflow sources sensibly
  workflow = WorkflowBuilder()
  workflow.add_fetcher(CrossRefBibTeX())  # primary source
  workflow.add_fetcher(DBLPBibTeX())      # fallback source
  Add data sources in order of reliability.
- Add delays when batch processing
  When processing a large number of citations, use the built-in delay mechanism to avoid hitting API rate limits (a caller-side throttling sketch follows this list).
- Get a SerpAPI key
  To use the Google Scholar features you need a SerpAPI key. Here is how to get one:
  - Sign up for a SerpAPI account
    - Visit the SerpAPI website
    - Click the "Sign Up" button in the top-right corner
    - Fill in the registration details (email, password, etc.)
  - Choose a suitable plan
    - Free plan: 100 searches per month
    - Paid plans: pick a tier that matches your needs
    - For testing and personal use, the free plan is usually enough
  - Get the API key
    - Log in and open the Dashboard
    - Find your key in the "API Key" section
    - Copy the key for use in your code
  - Usage example
    from apiModels import GoogleScholarBibTeX

    # Initialize the Google Scholar fetcher
    fetcher = GoogleScholarBibTeX(api_key="your-serpapi-key")

    # Fetch a citation
    bibtex = fetcher.get_bibtex("Deep learning with differential privacy")
    print(bibtex)
  - Things to keep in mind
    - Protect your API key and never share it publicly
    - Monitor your usage to avoid exceeding the quota
    - Set a reasonable interval between requests (at least 1 second is recommended)
    - In production, store the API key in an environment variable:
      import os
      api_key = os.getenv("SERPAPI_KEY")
      fetcher = GoogleScholarBibTeX(api_key=api_key)
  - Recommendations
    - Prefer CrossRef and DBLP
    - Fall back to Google Scholar only when no result is found
    - Mind the API usage limits when batch processing
    # Recommended workflow order
    workflow = WorkflowBuilder()
    workflow.add_fetcher(CrossRefBibTeX(email="[email protected]"))
    workflow.add_fetcher(DBLPBibTeX())
    workflow.add_fetcher(GoogleScholarBibTeX(api_key="your-serpapi-key"))
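For the throttling point above: if you want to control pacing yourself rather than rely on the built-in delay, a plain sleep between requests is enough. This is a minimal caller-side sketch, not a feature of the library:

import time
from get_bibtex import CrossRefBibTeX

fetcher = CrossRefBibTeX(email="[email protected]")
queries = ["10.1145/3292500.3330919", "10.1038/nature14539"]

for query in queries:
    print(fetcher.get_bibtex(query))  # one API request per query
    time.sleep(1)                     # pause about a second between requests to stay under rate limits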
Future Plans#
- Support for more data sources
- Citation format conversion
- A graphical user interface
- More customization options
Conclusion#
get-bibtex aims to simplify reference management in academic writing. Whether you are working on a single paper or a full literature review, it helps you fetch and manage citations efficiently. Contributions, suggestions, and bug reports are welcome on GitHub.
Related Links#
- GitHub repository: get-bibtex
- Issue tracker: Issues
- PyPI page: get-bibtex