Introduction#
In academic research, managing references is important but time-consuming. When writing papers in particular, we often need to collect citations from several different databases. To address this, I developed get-bibtex, a Python library that helps researchers quickly obtain BibTeX citations from multiple academic databases.
Why Choose get-bibtex?#
1. Multi-source Support#
- CrossRef (the most comprehensive DOI database)
- DBLP (computer science literature database)
- Google Scholar (via SerpAPI; requires an API key)
2. Intelligent Workflow#
from get_bibtex import WorkflowBuilder, CrossRefBibTeX, DBLPBibTeX
# Create workflow
workflow = WorkflowBuilder()
workflow.add_fetcher(CrossRefBibTeX("your.email@example.com"))
workflow.add_fetcher(DBLPBibTeX())
# Batch processing
papers = [
    "10.1145/3292500.3330919",  # Using DOI
    "Attention is all you need"  # Using title
]
results = workflow.get_multiple_bibtex(papers)
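The list above mixes a DOI with a title. How the library tells them apart is not shown here. As a rough sketch, assuming you want to pre-sort queries yourself, the helper below checks for the common `10.<registrant>/<suffix>` shape of a DOI. `looks_like_doi` and `DOI_PATTERN` are hypothetical names for illustration and are not part of get-bibtex.

```python
import re

# Hypothetical helper, not part of get-bibtex: matches the common
# "10.<registrant>/<suffix>" shape of modern DOIs.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def looks_like_doi(query: str) -> bool:
    """Return True if the query has the shape of a DOI."""
    return bool(DOI_PATTERN.match(query.strip()))


papers = [
    "10.1145/3292500.3330919",
    "Attention is all you need",
]

# Split the mixed list into DOI queries and title queries.
dois = [p for p in papers if looks_like_doi(p)]
titles = [p for p in papers if not looks_like_doi(p)]
print(dois)
print(titles)
```

A pattern like this is only a heuristic: it is handy for sorting or validating an input list, but it does not confirm that the DOI actually resolves.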
3. Simple and Easy to Use#
from get_bibtex import CrossRefBibTeX
# Single citation retrieval
fetcher = CrossRefBibTeX(email="your.email@example.com")
bibtex = fetcher.get_bibtex("10.1145/3292500.3330919")
print(bibtex)
4. File Batch Processing#
# Read from file and save
workflow.process_file(
    input_path="papers.txt",
    output_path="references.bib"
)
Featured Functions#
- Intelligent Fallback Mechanism
  - Automatically tries other data sources when one fails
  - Maximizes the number of citations retrieved
- Progress Tracking
  - Displays processing progress using tqdm
  - Shows the status of batch processing at a glance
- Error Handling
  - Detailed logging
  - Gracefully handles API limits and network errors
- Formatted Output
  - Automatically cleans and formats BibTeX
  - Keeps the output format consistent
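To make the fallback idea concrete, here is a minimal sketch of trying sources in order until one returns a result. It illustrates the general pattern only and is not the library's actual implementation. `fetch_with_fallback` and the three stand-in sources are hypothetical and make no network calls.

```python
from typing import Callable, Optional, Sequence

# A fetcher here is any callable that returns a BibTeX string or None.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(query: str, fetchers: Sequence[Fetcher]) -> Optional[str]:
    """Try each fetcher in order and return the first non-empty result."""
    for fetch in fetchers:
        try:
            result = fetch(query)
        except Exception as exc:  # e.g. a network error or an API limit
            print(f"{getattr(fetch, '__name__', fetch)} failed: {exc}")
            continue
        if result:
            return result
    return None


# Stand-in sources for illustration only; they do not call any real API.
def failing_source(query: str) -> Optional[str]:
    raise ConnectionError("simulated network error")


def empty_source(query: str) -> Optional[str]:
    return None


def working_source(query: str) -> Optional[str]:
    return "@misc{demo, title={" + query + "}}"


entry = fetch_with_fallback(
    "Deep learning", [failing_source, empty_source, working_source]
)
print(entry)
```

The order of the list decides which source wins, which is why the more reliable source should come first.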
Use Cases#
Paper Writing#
When you are writing a paper, you can fetch citations directly by DOI or by title:
from get_bibtex import CrossRefBibTeX
fetcher = CrossRefBibTeX()
citations = [
"Machine learning",
"Deep learning",
"10.1038/nature14539"
]
for citation in citations:
    bibtex = fetcher.get_bibtex(citation)
    print(bibtex)
Literature Review#
Batch process a large number of literature citations:
from get_bibtex import WorkflowBuilder, CrossRefBibTeX, DBLPBibTeX
workflow = WorkflowBuilder()
workflow.add_fetcher(CrossRefBibTeX())
workflow.add_fetcher(DBLPBibTeX())
# Read literature list from file
workflow.process_file("papers.txt", "bibliography.bib")
Obtaining Citations for Papers Related to Attention Mechanisms#
Suppose we need to obtain citations for the following papers:
- FedMSA: A Model Selection and Adaptation System for Federated Learning
- Attention Is All You Need: A Pioneering Work on Transformers
- Non-Local Neural Networks
- ECA-Net: Efficient Channel Attention Mechanism
- CBAM: Convolutional Block Attention Module
Using CrossRef to Retrieve (via DOI)#
from get_bibtex import CrossRefBibTeX
fetcher = CrossRefBibTeX(email="your.email@example.com")
# FedMSA
bibtex = fetcher.get_bibtex("10.3390/s22197244")
print(bibtex)
# ECA-Net
bibtex = fetcher.get_bibtex("10.1109/cvpr42600.2020.01155")
print(bibtex)
Example Output:
@article{Sun_2022,
title={FedMSA: A Model Selection and Adaptation System for Federated Learning},
volume={22},
ISSN={1424-8220},
url={http://dx.doi.org/10.3390/s22197244},
DOI={10.3390/s22197244},
number={19},
journal={Sensors},
publisher={MDPI AG},
author={Sun, Rui and Li, Yinhao and Shah, Tejal and Sham, Ringo W. H. and Szydlo, Tomasz and Qian, Bin and Thakker, Dhaval and Ranjan, Rajiv},
year={2022},
month=sep,
pages={7244}
}
Using DBLP to Retrieve (via Title)#
from get_bibtex import DBLPBibTeX
fetcher = DBLPBibTeX()
# CBAM
bibtex = fetcher.get_bibtex("CBAM: Convolutional Block Attention Module")
print(bibtex)
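Example Output (note that this title search returned a different paper that mentions CBAM, so always check the returned title):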
@article{DBLP:journals/access/WangZHLL24,
author = {Niannian Wang and Zexi Zhang and Haobang Hu and Bin Li and Jianwei Lei},
title = {Underground Defects Detection Based on {GPR} by Fusing Simple Linear Iterative Clustering Phash (SLIC-Phash) and Convolutional Block Attention Module (CBAM)-YOLOv8},
journal = {{IEEE} Access},
volume = {12},
pages = {25888--25905},
year = {2024},
url = {https://doi.org/10.1109/ACCESS.2024.3365959},
doi = {10.1109/ACCESS.2024.3365959}
}
Using Workflow to Retrieve Multiple Citations#
from get_bibtex import WorkflowBuilder, CrossRefBibTeX, DBLPBibTeX
# Create workflow
workflow = WorkflowBuilder()
workflow.add_fetcher(CrossRefBibTeX(email="your.email@example.com"))
workflow.add_fetcher(DBLPBibTeX())
# Prepare query list
queries = [
"FedMSA: A Model Selection and Adaptation System for Federated Learning",
"Attention Is All You Need",
"Non-Local Neural Networks",
"ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks",
"CBAM: Convolutional Block Attention Module"
]
# Retrieve all citations
results = workflow.get_multiple_bibtex(queries)
# Print results
for query, bibtex in results.items():
    print(f"\nQuery: {query}")
    print(f"Citation:\n{bibtex if bibtex else 'Not found'}")
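The loop above treats the results as a mapping from each query to either a BibTeX string or an empty value. Under that assumption, a small sketch like the following separates the hits from the misses so the misses can be retried or looked up by hand. `split_results` is a hypothetical helper, and the sample dictionary stands in for real workflow output.

```python
from typing import Dict, List, Optional, Tuple


def split_results(
    results: Dict[str, Optional[str]],
) -> Tuple[Dict[str, str], List[str]]:
    """Separate queries that produced a BibTeX entry from those that did not."""
    found = {query: bib for query, bib in results.items() if bib}
    missing = [query for query, bib in results.items() if not bib]
    return found, missing


# Sample data standing in for the dictionary returned by the workflow.
results = {
    "Paper A": "@article{a, title={Paper A}}",
    "Paper B": None,
    "Paper C": "@article{c, title={Paper C}}",
}

found, missing = split_results(results)

# Join the entries that were found into one .bib-style text block.
bib_text = "\n\n".join(found.values()) + "\n"
print(bib_text)
print("Still missing:", missing)
```

The resulting `bib_text` can be written to a `.bib` file, while `missing` gives a short list to retry with a DOI or another source.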
File Batch Processing#
You can create a workflow to process a file containing multiple citations. First, create the workflow and add data sources:
from get_bibtex import WorkflowBuilder, CrossRefBibTeX, DBLPBibTeX
workflow = WorkflowBuilder()
workflow.add_fetcher(CrossRefBibTeX(email="your.email@example.com"))
workflow.add_fetcher(DBLPBibTeX())
# Process file
workflow.process_file("papers.txt", "references.bib")
Example Input:
papers.txt
FedMSA: A Model Selection and Adaptation System for Federated Learning
Attention Is All You Need
Non-Local Neural Networks
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
CBAM: Convolutional Block Attention Module
Example Output:
references.bib
% Query: FedMSA: A Model Selection and Adaptation System for Federated Learning
% Source: CrossRefBibTeX
@article{Sun_2022, title={FedMSA: A Model Selection and Adaptation System for Federated Learning}, volume={22}, ISSN={1424-8220}, url={http://dx.doi.org/10.3390/s22197244}, DOI={10.3390/s22197244}, number={19}, journal={Sensors}, publisher={MDPI AG}, author={Sun, Rui and Li, Yinhao and Shah, Tejal and Sham, Ringo W. H. and Szydlo, Tomasz and Qian, Bin and Thakker, Dhaval and Ranjan, Rajiv}, year={2022}, month=sep, pages={7244} }
% Query: Attention Is All You Need
% Source: DBLPBibTeX
@inproceedings{DBLP:conf/dac/ZhangYY21,
author = {Xiaopeng Zhang and
Haoyu Yang and
Evangeline F. Y. Young},
title = {Attentional Transfer is All You Need: Technology-aware Layout Pattern
Generation},
booktitle = {58th {ACM/IEEE} Design Automation Conference, {DAC} 2021, San Francisco,
CA, USA, December 5-9, 2021},
pages = {169--174},
publisher = {{IEEE}},
year = {2021},
url = {https://doi.org/10.1109/DAC18074.2021.9586227},
doi = {10.1109/DAC18074.2021.9586227},
timestamp = {Wed, 03 May 2023 17:06:11 +0200},
biburl = {https://dblp.org/rec/conf/dac/ZhangYY21.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
% Query: Non-Local Neural Networks
% Source: CrossRefBibTeX
@article{Xu_2024, title={Adaptive selection of local and non-local attention mechanisms for speech enhancement}, volume={174}, ISSN={0893-6080}, url={http://dx.doi.org/10.1016/j.neunet.2024.106236}, DOI={10.1016/j.neunet.2024.106236}, journal={Neural Networks}, publisher={Elsevier BV}, author={Xu, Xinmeng and Tu, Weiping and Yang, Yuhong}, year={2024}, month=jun, pages={106236} }
% Query: ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
% Source: CrossRefBibTeX
@inproceedings{Wang_2020, title={ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks}, url={http://dx.doi.org/10.1109/cvpr42600.2020.01155}, DOI={10.1109/cvpr42600.2020.01155}, booktitle={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, publisher={IEEE}, author={Wang, Qilong and Wu, Banggu and Zhu, Pengfei and Li, Peihua and Zuo, Wangmeng and Hu, Qinghua}, year={2020}, month=jun, pages={11531–11539} }
% Query: CBAM: Convolutional Block Attention Module
% Source: DBLPBibTeX
@article{DBLP:journals/access/WangZHLL24,
author = {Niannian Wang and
Zexi Zhang and
Haobang Hu and
Bin Li and
Jianwei Lei},
title = {Underground Defects Detection Based on {GPR} by Fusing Simple Linear
Iterative Clustering Phash (SLIC-Phash) and Convolutional Block Attention
Module (CBAM)-YOLOv8},
journal = {{IEEE} Access},
volume = {12},
pages = {25888--25905},
year = {2024},
url = {https://doi.org/10.1109/ACCESS.2024.3365959},
doi = {10.1109/ACCESS.2024.3365959},
timestamp = {Sat, 16 Mar 2024 15:09:59 +0100},
biburl = {https://dblp.org/rec/journals/access/WangZHLL24.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
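In the sample output above, the queries for "Attention Is All You Need", "Non-Local Neural Networks", and "CBAM: Convolutional Block Attention Module" each returned a different paper whose title only partly overlaps with the query. A simple safeguard is to compare the returned title with the query before accepting an entry. The sketch below uses only the standard library. `extract_title` and `title_matches` are hypothetical helpers, and the 0.8 threshold is an arbitrary starting point, not a value taken from get-bibtex.

```python
import re
from difflib import SequenceMatcher
from typing import Optional


def extract_title(bibtex: str) -> Optional[str]:
    """Return the brace-delimited title field of a BibTeX entry, or None."""
    match = re.search(r"\btitle\s*=\s*\{", bibtex, re.IGNORECASE)
    if not match:
        return None
    depth, chars = 1, []
    # Walk forward until the brace that opened the title value is closed.
    for ch in bibtex[match.end():]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                break
        chars.append(ch)
    # Drop inner protective braces and collapse line breaks and extra spaces.
    return " ".join("".join(chars).replace("{", "").replace("}", "").split())


def title_matches(query: str, bibtex: str, threshold: float = 0.8) -> bool:
    """Check whether the entry's title is close enough to the query title."""
    title = extract_title(bibtex)
    if title is None:
        return False
    ratio = SequenceMatcher(None, query.lower(), title.lower()).ratio()
    return ratio >= threshold


good = (
    "@inproceedings{Wang_2020, title={ECA-Net: Efficient Channel Attention "
    "for Deep Convolutional Neural Networks}, year={2020} }"
)
bad = """@article{DBLP:journals/access/WangZHLL24,
  title = {Underground Defects Detection Based on {GPR} by Fusing Simple Linear
           Iterative Clustering Phash (SLIC-Phash) and Convolutional Block Attention
           Module (CBAM)-YOLOv8},
  year = {2024}
}"""

print(title_matches(
    "ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks",
    good,
))
print(title_matches("CBAM: Convolutional Block Attention Module", bad))
```

This check only applies to title queries. For DOI queries the DOI field of the returned entry is the more reliable thing to compare.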
Installation#
Using pip:
pip install get-bibtex
Using Poetry:
poetry add get-bibtex
Best Practices#
- Register with Email
  fetcher = CrossRefBibTeX(email="your.email@example.com")
  Providing an email address can give you better API access priority.
- Use Workflow Wisely
  workflow = WorkflowBuilder()
  workflow.add_fetcher(CrossRefBibTeX())  # Primary source
  workflow.add_fetcher(DBLPBibTeX())  # Backup source
  Add data sources in order of reliability.
- Add Delay When Batch Processing
  When processing a large number of citations, use the built-in delay mechanism to avoid triggering API limits.
- Obtain SerpAPI Key
  To use the Google Scholar features, you need a SerpAPI key. Here are the steps to obtain one:
  - Register for a SerpAPI account
    - Visit the official SerpAPI website
    - Click the "Sign Up" button in the upper right corner
    - Fill in the registration information (email, password, etc.)
  - Choose a suitable plan
    - Free plan: 100 searches per month
    - Paid plans: choose a tier based on your needs
    - For testing and personal use, the free plan is usually sufficient.
  - Obtain API Key
    - After logging in, go to the Dashboard
    - Find your key in the "API Key" section
    - Copy the key for use in your code.
  - Usage Example
    from get_bibtex import GoogleScholarBibTeX
    # Initialize Google Scholar fetcher
    fetcher = GoogleScholarBibTeX(api_key="your-serpapi-key")
    # Get citation
    bibtex = fetcher.get_bibtex("Deep learning with differential privacy")
    print(bibtex)
  - Notes
    - Protect your API key and do not share it publicly.
    - Monitor usage to avoid exceeding limits.
    - Set reasonable request intervals (at least 1 second is recommended).
    - Use environment variables to store API keys in production environments.
    import os
    api_key = os.getenv("SERPAPI_KEY")
    fetcher = GoogleScholarBibTeX(api_key=api_key)
  - Usage Recommendations
    - Prioritize using CrossRef and DBLP.
    - Use Google Scholar only when the other sources return no result.
    - Be mindful of API usage limits when batch processing.
    # Recommended workflow order
    workflow = WorkflowBuilder()
    workflow.add_fetcher(CrossRefBibTeX(email="your.email@example.com"))
    workflow.add_fetcher(DBLPBibTeX())
    workflow.add_fetcher(GoogleScholarBibTeX(api_key="your-serpapi-key"))
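The advice above to add a delay when batch processing, with at least one second between requests, can be illustrated with a small generic throttle. This is a sketch of the idea and not the library's built-in delay mechanism, whose interface is not shown in this article. `fetch_throttled` and `fake_fetch` are hypothetical names, and the stand-in fetcher makes no network calls.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def fetch_throttled(
    queries: Iterable[str],
    fetch: Callable[[str], Optional[str]],
    min_interval: float = 1.0,
) -> Iterator[Tuple[str, Optional[str]]]:
    """Call fetch for each query, keeping at least min_interval seconds between calls."""
    last_call = None
    for query in queries:
        if last_call is not None:
            wait = min_interval - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        last_call = time.monotonic()
        yield query, fetch(query)


# Stand-in fetcher for illustration; swap in a real fetcher's get_bibtex method.
def fake_fetch(query: str) -> Optional[str]:
    return f"@misc{{demo, title={{{query}}}}}"


start = time.monotonic()
# A short interval keeps the demo fast; use 1.0 or more against real APIs.
pairs = list(fetch_throttled(["a", "b", "c"], fake_fetch, min_interval=0.1))
elapsed = time.monotonic() - start
print(pairs[0])
print(f"{len(pairs)} queries in {elapsed:.2f}s")
```

Measuring from the start of the previous call means a slow request does not add extra waiting on top of its own duration.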
Future Prospects#
- Support for more data sources
- Citation format conversion
- A graphical user interface
- More customization options
Conclusion#
get-bibtex is dedicated to simplifying literature management in academic writing. Whether you need a single citation or the references for a full literature review, it can help you obtain and manage them efficiently. You are welcome to take part in the project's development on GitHub, make suggestions, or report issues.
Related Links#
- GitHub Repository: get-bibtex
- Issue Feedback: Issues
- PyPI Page: get-bibtex
This article was originally published at https://liuyaowen.cn/posts/person/20241231 and is synchronized to xLog by Mix Space.