code-travail/code-travail-extract.py

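from pypdf import PdfReader

# Load the source PDF.
reader = PdfReader("/home/alban/code/wokegpt/sources/code-travail-2022.pdf")

# Concatenate the extracted text of every page, with a newline before each page.
full_text = ""
for page in reader.pages:
    text = page.extract_text()
    full_text += "\n"
    full_text += text

# Overwrite the output on each run so that re-running does not append a second copy;
# UTF-8 keeps accented characters intact regardless of the platform default.
with open("/home/alban/code/wokegpt/sources/code-travail-2022.txt", "w", encoding="utf-8") as myfile:
    myfile.write(full_text)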