In this tutorial, we have compiled a variety of Python methods to list all files in a directory, such as os.walk(), os.listdir(), os.scandir(), glob.glob(), and a recursive function. Each method includes a self-explanatory example so that you can grasp them easily. Towards the end, you’ll find a table comparing these methods and advising which is the most suitable.
You may need such techniques, especially in Selenium Python automation or when working with configuration or log files. Python ships with the built-in os module, which provides several functions for interacting with the file system. As mentioned above, it has a walk() function that visits a directory tree and reports the files and sub-directories found in each folder. Besides, it has another function, listdir(), that returns the names of the entries at the specified path.
Similarly, Python’s glob module has a glob() function that finds the paths matching a wildcard pattern. Let’s now have a deeper look at each of the methods for listing all files in a directory.
Python Methods to List All Files in a Directory
Here, we are demonstrating functions that help traverse the file system and search for the files present.
os.walk() method
It gathers the file and directory names in a directory tree by traversing it either top-down (the default) or bottom-up. For each directory it visits, it yields a tuple of the following three:
Root: The path of the directory currently being visited.
Dirs: The names of the sub-directories directly inside the root.
Files: The names of the non-directory files directly inside the root.
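To see how the three values fit together, here is a minimal sketch. It builds a small throwaway tree with tempfile (the folder and file names are made up for illustration) and walks it top-down. The helper walk_tree() and its skip_dirs parameter are our own names, not part of the os module.

```python
import os
import tempfile


def walk_tree(top, skip_dirs=()):
    """Return (root, dirs, files) triples from os.walk, skipping the named folders."""
    visited = []
    for root, dirs, files in os.walk(top, topdown=True):
        # In top-down mode, editing dirs in place stops os.walk
        # from descending into the names that were removed.
        dirs[:] = sorted(d for d in dirs if d not in skip_dirs)
        visited.append((root, list(dirs), sorted(files)))
    return visited


# Hypothetical tree: notes/readme.txt and cache/tmp.bin
with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, "notes"))
    os.makedirs(os.path.join(demo, "cache"))
    open(os.path.join(demo, "notes", "readme.txt"), "w").close()
    open(os.path.join(demo, "cache", "tmp.bin"), "w").close()

    for root, dirs, files in walk_tree(demo, skip_dirs=("cache",)):
        # Print the root relative to the demo folder to keep the output short
        print(os.path.relpath(root, demo), dirs, files)
```

Pruning only works in top-down mode. With topdown=False, the sub-directories have already been visited by the time their parent is reported.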
Find all text files in dirs and subdirs
Below is sample Python code that lists all text files in the given directory and its sub-directories.
```python
import os

location = 'c:/test/temp/'
files_in_dir = []

# r => root, d => directories, f => files
for r, d, f in os.walk(location):
    for item in f:
        # endswith() checks the extension; "in" would also match names like "a.txt.bak"
        if item.endswith('.txt'):
            files_in_dir.append(os.path.join(r, item))

for item in files_in_dir:
    print("file in dir:", item)
```
After execution, the result looks like the following (on Windows, os.path.join() inserts backslashes between the parts it joins):
```
file in dir: c:/test/temp/notes/readme.txt
file in dir: c:/test/temp/release/artifact_list.txt
file in dir: c:/test/temp/dist/doc/readme.txt
file in dir: c:/test/temp/dist/samples/sample.txt
```
List all dirs under given dirs and subdirs
Check the Python code below to find and print all dirs under the given dir and its subdirs.
```python
import os

location = 'c:/test/temp/'
dirs_in_dir = []

# r => root, d => directories, f => files
for r, d, f in os.walk(location):
    for item in d:
        # Directory names need no extension filter, so collect every one
        dirs_in_dir.append(os.path.join(r, item))

for item in dirs_in_dir:
    print("Dirs under dir:", item)
```
After execution, the result looks like the following (on Windows, os.path.join() inserts backslashes between the parts it joins):
```
Dirs under dir: c:/test/temp/notes
Dirs under dir: c:/test/temp/release
Dirs under dir: c:/test/temp/dist
Dirs under dir: c:/test/temp/dist/doc
Dirs under dir: c:/test/temp/dist/samples
```
glob.glob() method
Occasionally, we may have to iterate a directory to find files with names matching a pattern. In such a case, the Python glob module can help capture the list of files in a given directory with a particular extension.
glob() function
This function returns a list of paths that match the given pattern in the pathname. The pathname can be absolute or relative, and it may contain wildcards such as * and ?.
Another parameter, recursive, is off (False) by default. If its value is True, the pattern ** matches any files and zero or more directories and subdirectories, so the search covers the whole tree below the starting directory.
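The effect of the recursive flag is easiest to see side by side. The sketch below is only an illustration: find_py_files() is a helper name we made up, and the file layout is created in a temporary folder so the snippet can run anywhere.

```python
import glob
import os
import tempfile


def find_py_files(base, recursive):
    """Return the .py files matched under base as sorted paths relative to base."""
    # With recursive=True, "**" matches zero or more directory levels.
    # With recursive=False, "**" behaves like a single "*".
    pattern = os.path.join(base, "**", "*.py")
    matches = glob.glob(pattern, recursive=recursive)
    return sorted(os.path.relpath(m, base) for m in matches)


# Hypothetical layout: top.py, pkg/mod.py, pkg/sub/deep.py
with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "pkg", "sub"))
    for rel in ("top.py",
                os.path.join("pkg", "mod.py"),
                os.path.join("pkg", "sub", "deep.py")):
        open(os.path.join(base, rel), "w").close()

    print(find_py_files(base, recursive=False))  # only files exactly one level down
    print(find_py_files(base, recursive=True))   # files at every depth, base included
```

Without the flag, only pkg/mod.py is found. With it, all three files are returned.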
List all files of “.py” type
The following Python code fetches all files with the “.py” extension in c:/test/temp/ and its subdirectories.
```python
import glob

location = 'c:/test/temp/'

# "**" with recursive=True matches files in location and in every subdirectory
fileset = glob.glob(location + "**/*.py", recursive=True)

for file in fileset:
    print(file)
```
After execution, the result looks like the following (the order is not guaranteed, and the separators may differ on Windows):
```
c:/test/temp/notes/get_sample.py
c:/test/temp/release/test1.py
c:/test/temp/dist/doc/core.py
c:/test/temp/dist/samples/first_sample.py
```
Get all dirs in a specified dir and subdirs
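```python
import glob

location = 'c:/test/temp/'

# A pattern that ends with "/" matches directories only
folderset = glob.glob(location + "**/", recursive=True)

for folder in folderset:
    print(folder)
```
After running the above code, the result looks like the following (separators may differ on Windows). Because ** also matches zero directories, the starting directory itself is included:
```
c:/test/temp/
c:/test/temp/notes/
c:/test/temp/release/
c:/test/temp/dist/
c:/test/temp/dist/doc/
c:/test/temp/dist/samples/
```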
os.listdir() method to list text files
It returns a list containing the names of the entries (both files and sub-directories) in the directory specified in the location (path). The list is in arbitrary order. It does not include the special entries ‘.’ and ‘..’ even if they are present in the directory.
```python
import os

# os.listdir() is not recursive, so point it at the folder that holds the files
location = 'c:/test/temp/notes/'

for file in os.listdir(location):
    if file.endswith(".txt"):
        print(os.path.join(location, file))
```
After execution, only the text files directly inside the given folder are printed:
```
c:/test/temp/notes/readme.txt
```
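Since os.listdir() returns bare names for files and folders alike, a common follow-up step is to sort the names and separate the two kinds. The snippet below is a small sketch of that idea. split_entries() is a hypothetical helper, and the demo folder is created with tempfile.

```python
import os
import tempfile


def split_entries(path):
    """Return (files, dirs) name lists for path, each sorted alphabetically."""
    files, dirs = [], []
    for name in sorted(os.listdir(path)):  # sort because listdir order is arbitrary
        full = os.path.join(path, name)    # listdir returns names, not full paths
        if os.path.isdir(full):
            dirs.append(name)
        else:
            files.append(name)
    return files, dirs


with tempfile.TemporaryDirectory() as demo:  # hypothetical demo folder
    os.mkdir(os.path.join(demo, "docs"))
    open(os.path.join(demo, "b.txt"), "w").close()
    open(os.path.join(demo, "a.txt"), "w").close()
    print(split_entries(demo))
```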
Using Path.iterdir()
The Path.iterdir() method is part of the pathlib module, which was added in Python 3.4 and provides an object-oriented way to work with paths. Path.iterdir() yields a Path object for each file and directory in a directory.
Let’s find out how it works.
```python
from pathlib import Path

# iterdir() returns an iterator over the entries (files and directories)
# in the current directory
entries = Path().iterdir()

# Print each entry
for entry in entries:
    print(entry)
```
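Because iterdir() yields Path objects, you can filter the entries with the methods and attributes those objects already have. Here is a minimal sketch that keeps only regular files with a given extension. files_with_suffix() is a name we chose for illustration, and the sample entries are created in a temporary folder.

```python
from pathlib import Path
import tempfile


def files_with_suffix(directory, suffix):
    """Return sorted names of regular files in directory whose extension equals suffix."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()  # yields files and folders alike
        if p.is_file() and p.suffix == suffix
    )


with tempfile.TemporaryDirectory() as demo:  # hypothetical demo folder
    root = Path(demo)
    (root / "logs.txt").touch()
    (root / "app.py").touch()
    (root / "archive.txt").mkdir()  # a directory whose name happens to end in .txt
    print(files_with_suffix(root, ".txt"))
```

The is_file() check matters here: a plain name comparison would also pick up the archive.txt directory.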
Using the os.scandir() function
The os.scandir() function, added in Python 3.5, returns an iterator of os.DirEntry objects. This is useful when you need the type or attributes of each entry while iterating, because the entries can often provide them without extra system calls.
Go through the code below, which demonstrates how to use os.scandir() to list all files in a directory.
```python
import os


def list_files_with_details(directory):
    """Lists all of the entries in a directory and prints some additional
    details about each one, such as its size and type."""
    # os.scandir() returns an iterator of directory entries;
    # the with statement closes it when the loop is done
    with os.scandir(directory) as entries:
        for entry in entries:
            # Get the size from the entry's stat result
            file_size = entry.stat().st_size
            # Get a readable type (st_mode alone is just a number of mode bits)
            file_type = "directory" if entry.is_dir() else "file"
            # Print the name, size, and type
            print(f"{entry.name}: {file_size} bytes ({file_type})")


# List the entries in the current directory with their details
list_files_with_details(".")
```
The above code will print a list of all the files in the current directory, along with the file size and type. You can modify it to print any additional information about the files. For example, you could print the file creation date, the file’s last modified date, or the file permissions.
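As one possible way to extend the listing, the sketch below adds the last modified date and the permission string for each entry. describe_entries() is a hypothetical helper, and the date format is an arbitrary choice. The creation date is left out because it is not available in the same way on every platform.

```python
import os
import stat
import tempfile
from datetime import datetime


def describe_entries(directory):
    """Return sorted (name, kind, size, modified, permissions) tuples for each entry."""
    rows = []
    with os.scandir(directory) as entries:
        for entry in entries:
            info = entry.stat()  # one stat result per entry, reused below
            kind = "dir" if entry.is_dir() else "file"
            modified = datetime.fromtimestamp(info.st_mtime).strftime("%Y-%m-%d %H:%M")
            # stat.filemode() turns the mode bits into a string such as "-rw-r--r--"
            rows.append((entry.name, kind, info.st_size, modified,
                         stat.filemode(info.st_mode)))
    return sorted(rows)


with tempfile.TemporaryDirectory() as demo:  # hypothetical demo folder
    with open(os.path.join(demo, "notes.txt"), "w") as fh:
        fh.write("hello")
    os.mkdir(os.path.join(demo, "logs"))
    for row in describe_entries(demo):
        print(row)
```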
Python recursive method to list all files in a directory
In Python, you can utilize a recursive function to list the files in a directory along with the subdirectories. The following function will achieve it recursively:
```python
import os


def list_files_recursively(directory):
    """Lists all of the files in a directory and all of its subdirectories, recursively."""
    files = []
    # Iterate over the files and directories in the directory
    for entry in os.scandir(directory):
        # If the entry is a directory, recursively list the files in that directory
        if entry.is_dir():
            files += list_files_recursively(entry.path)
        # If the entry is a file, add it to the list of files
        elif entry.is_file():
            files.append(entry.path)
    return files


# Get a list of all the files in the current directory and all of its subdirectories
files = list_files_recursively(".")

# Print the list of files
for file in files:
    print(file)
```
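Writing the recursion yourself also lets you decide how far it goes. The following variation is a sketch under our own assumptions: list_files_limited() and its max_depth parameter are illustrative names, and the function stops descending once the depth limit is reached.

```python
import os
import tempfile


def list_files_limited(directory, max_depth):
    """List files under directory, descending at most max_depth levels below it."""
    found = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                found.append(entry.path)
            # follow_symlinks=False keeps symlinked folders from causing endless loops
            elif entry.is_dir(follow_symlinks=False) and max_depth > 0:
                found.extend(list_files_limited(entry.path, max_depth - 1))
    return sorted(found)


# Hypothetical tree: a.txt, d1/b.txt, d1/d2/c.txt
with tempfile.TemporaryDirectory() as demo:
    os.makedirs(os.path.join(demo, "d1", "d2"))
    for rel in ("a.txt",
                os.path.join("d1", "b.txt"),
                os.path.join("d1", "d2", "c.txt")):
        open(os.path.join(demo, rel), "w").close()

    for depth in (0, 1, 2):
        paths = list_files_limited(demo, depth)
        print(depth, [os.path.relpath(p, demo) for p in paths])
```

A depth of 0 lists only the files in the starting directory, and each extra level adds one more layer of subdirectories.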
Comparing different methods
Here is a table to compare all methods, their performance, how one can decide which one to choose, and which is best:
Method | Performance | How to choose |
---|---|---|
os.listdir() | Simple and efficient | If you just need to list the files in a directory |
pathlib.Path.iterdir() | Returns Path objects, which are more flexible than the plain names from os.listdir() | In case you need to use more advanced features of the pathlib module, such as handling symbolic links. |
glob.glob() | Powerful and flexible for listing files based on patterns | When you need to list files based on a pattern, such as all Python files in a directory. |
os.scandir() | More efficient than os.listdir() when you also need the type or attributes of each entry | If you need to iterate over the files in a directory more efficiently. |
os.walk() | Recursively lists the files and directories in a directory and all of its subdirectories | When you need to list the files and directories in a directory and all its subdirectories. |
Recursive function | Can be used to list the files in a directory and all of its subdirectories | If you need to list the files in a directory and all its subdirectories, plus you want more control over the recursion. |
The performance of these methods will depend on the number of files in the directory and the complexity of the search pattern. In general, os.scandir() will be the most efficient way to iterate over the files in a directory when you also need their type or attributes. Its os.DirEntry objects cache the information the operating system returns while scanning the directory, which avoids unnecessary system calls.
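If you want to check this on your own machine, a rough comparison could look like the sketch below. Both helpers are names we made up, the file count of 2000 is arbitrary, and the timings will vary with the operating system and file system, so treat the numbers as indicative only.

```python
import os
import tempfile
import time


def count_files_listdir(path):
    """Count regular files with os.listdir plus one os.path.isfile check per name."""
    return sum(os.path.isfile(os.path.join(path, name)) for name in os.listdir(path))


def count_files_scandir(path):
    """Count regular files with os.scandir, whose entries usually know their type already."""
    with os.scandir(path) as entries:
        return sum(entry.is_file() for entry in entries)


with tempfile.TemporaryDirectory() as demo:  # hypothetical demo folder
    for i in range(2000):
        open(os.path.join(demo, f"file_{i}.txt"), "w").close()

    for func in (count_files_listdir, count_files_scandir):
        start = time.perf_counter()
        total = func(demo)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__}: {total} files in {elapsed:.4f} s")
```

Both functions return the same count. Only the amount of work needed to get there differs.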
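The performance of the glob module will depend on the complexity of the search pattern. A simple pattern confined to one directory is relatively fast, while a recursive ** pattern has to visit the whole tree and can take a long time on large directory trees.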
Enjoy coding,
TechBeamers.