
MathPixPDFLoader

Inspired by Daniel Gross's snippet here: https://gist.github.com/danielgross/3ab4104e14faccc12b49200843adab21

Overview

Integration details

| Class | Package | Local | Serializable | JS support |
| --- | --- | --- | --- | --- |
| MathpixPDFLoader | langchain_community | ✅ | ❌ | ❌ |

Loader features

| Source | Document Lazy Loading | Native Async Support |
| --- | --- | --- |
| MathpixPDFLoader | ✅ | ❌ |

Setup

Credentials

Sign up for Mathpix and create an API key, then set the MATHPIX_API_KEY variable in your environment:

import getpass
import os

if "MATHPIX_API_KEY" not in os.environ:
    os.environ["MATHPIX_API_KEY"] = getpass.getpass("Enter your Mathpix API key: ")
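
Before constructing the loader, it can help to fail early when a credential is missing. The sketch below uses a hypothetical helper, `missing_env_vars`, which is not part of LangChain; it reports which names are unset or empty in a mapping such as `os.environ`. `MATHPIX_API_KEY` comes from the text above. `MATHPIX_API_ID` is included only on the assumption that your Mathpix account also issues an app ID that the loader reads, so remove it if your setup does not use one.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    names: Iterable[str], env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names that are unset or empty in env (hypothetical helper)."""
    return [name for name in names if not env.get(name)]


# MATHPIX_API_ID is an assumption; remove it if your setup only uses the key.
required = ["MATHPIX_API_KEY", "MATHPIX_API_ID"]
missing = missing_env_vars(required)
if missing:
    print("Missing credentials:", ", ".join(missing))
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary instead of the real process environment.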

To enable automated tracing of your model calls, set your LangSmith API key:

# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"

Installation

Install langchain_community.

%pip install -qU langchain_community

Initialization

Now we are ready to initialize our loader:

from langchain_community.document_loaders import MathpixPDFLoader

file_path = "./example_data/layout-parser-paper.pdf"
loader = MathpixPDFLoader(file_path)
API Reference: MathpixPDFLoader

Load

docs = loader.load()
docs[0]
print(docs[0].metadata)
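
Each item returned by `load()` exposes `page_content` and `metadata`, so a quick summary can confirm the conversion produced text before anything is indexed. The following is a minimal sketch: `summarize` and `FakeDocument` are hypothetical names introduced here, and the demo content and metadata are made up for illustration, not actual Mathpix output.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class FakeDocument:
    """Stand-in exposing the same two attributes as a LangChain Document."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def summarize(docs: Iterable[Any], preview_chars: int = 80) -> List[Dict[str, Any]]:
    """Return one dict per document: length, a short preview, metadata keys."""
    summary = []
    for doc in docs:
        text = doc.page_content
        summary.append(
            {
                "chars": len(text),
                "preview": text[:preview_chars],
                "metadata_keys": sorted(doc.metadata),
            }
        )
    return summary


# Made-up demo data; with a real loader, pass the list returned by loader.load().
demo = [FakeDocument("# Title\n\nSome converted text.", {"source": "example.pdf"})]
for row in summarize(demo):
    print(row)
```

Because the helper only reads `page_content` and `metadata`, it should work unchanged on the `docs` list from the call above.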

Lazy Load

page = []
for doc in loader.lazy_load():
    page.append(doc)
    if len(page) >= 10:
        # do some paged operation, e.g.
        # index.upsert(page)

        page = []
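
The loop above resets `page` only after a full batch of 10, so any documents left over when the iterator ends are never handed to the paged operation. One way to handle that is a small generator that also yields the final partial batch. This is a sketch: `batched` is a hypothetical helper defined here, and `range(25)` stands in for `loader.lazy_load()`.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of up to `size` items, including a final partial batch."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush whatever is left after the loop ends
        yield batch


# Stand-in for loader.lazy_load(); any iterable of documents works the same way.
for page in batched(range(25), 10):
    print(len(page))
```

With 25 items and a batch size of 10, the loop sees two full batches followed by one batch holding the 5 leftover items. With a real loader, replace `print(len(page))` with the paged operation, for example an index upsert.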

API reference

For detailed documentation of all MathpixPDFLoader features and configurations, head to the API reference: https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.pdf.MathpixPDFLoader.html

