{"id":26751,"date":"2022-03-31T02:21:04","date_gmt":"2022-03-30T20:51:04","guid":{"rendered":"https:\/\/python-programs.com\/?p=26751"},"modified":"2022-03-31T02:21:04","modified_gmt":"2022-03-30T20:51:04","slug":"how-to-extract-images-from-a-pdf-in-python","status":"publish","type":"post","link":"https:\/\/python-programs.com\/how-to-extract-images-from-a-pdf-in-python\/","title":{"rendered":"How to Extract Images from a PDF in Python?"},"content":{"rendered":"

What is a PDF?<\/strong><\/p>\n

PDF is a popular format for distributing documents. The name stands for Portable Document Format, and files use the\u00a0.pdf<\/strong>\u00a0extension. Adobe Systems designed it in the early 1990s.<\/p>\n

Reading PDF documents in Python lets you automate tasks such as pulling text or images out of many files at once.<\/p>\n

Let us now see how to extract images from a PDF file in Python. For this purpose, we use the PyMuPDF and Pillow modules.<\/p>\n

Installation<\/strong><\/p>\n

\n
pip\u00a0install\u00a0PyMuPDF<\/pre>\n
pip\u00a0install\u00a0Pillow<\/pre>\n
Output:<\/strong><\/div>\n
Collecting PyMuPDF\r\nDownloading PyMuPDF-1.19.6-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014\r\n_x86_64.whl (8.8 MB)\r\n|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 8.8 MB 4.5 MB\/s \r\nInstalling collected packages: PyMuPDF\r\nSuccessfully installed PyMuPDF-1.19.6<\/pre>\n<\/div>\n
PyMuPDF Module:<\/strong> PyMuPDF is a Python binding for MuPDF, a lightweight PDF viewer.<\/div>\n
Pillow Module:<\/strong> Pillow is a fork of the Python Imaging Library (PIL) that lets you open, manipulate, and save images in a variety of formats.<\/div>\n
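To show how the two modules fit together, here is a minimal sketch rather than a definitive implementation. PyMuPDF (imported as fitz) reads the embedded image bytes, and Pillow decodes and saves them. The function names, the out_dir parameter, and the file naming scheme are assumptions made for illustration. Saving may fail for image formats that your Pillow build cannot write.

```python
import io
import os


def image_filename(page_number, image_number, ext):
    # Hypothetical naming scheme used only in this sketch, e.g. page1_img2.png
    return f'page{page_number}_img{image_number}.{ext}'


def extract_images(pdf_path, out_dir='.'):
    # Imported here so that image_filename() can be used even when
    # PyMuPDF and Pillow are not installed.
    import fitz  # PyMuPDF
    from PIL import Image

    saved = []
    doc = fitz.open(pdf_path)
    for page_index in range(len(doc)):
        page = doc[page_index]
        # Each entry describes one image on the page; its first item is the xref.
        for image_index, img in enumerate(page.get_images(full=True), start=1):
            xref = img[0]
            info = doc.extract_image(xref)  # dict holding the raw bytes and extension
            image = Image.open(io.BytesIO(info['image']))
            name = image_filename(page_index + 1, image_index, info['ext'])
            path = os.path.join(out_dir, name)
            image.save(path)
            saved.append(path)
    doc.close()
    return saved
```

Calling extract_images('sample.pdf') with a PDF of your own (the file name is a placeholder) should return the list of paths that were written.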

Extract Images from a PDF in Python<\/h2>\n

Approach:<\/strong><\/p>\n