List Directories, Files, and Subdirectories with Python

Creating, updating, and interacting with files is an integral part of data pipelines. To work with files programmatically, we first have to be able to list them. In this post we’ll cover three functions from Python’s os module that list directories, subdirectories, and files: os.listdir(), os.scandir(), and os.walk().

In this post we will:

  • Get all File and Subdirectory Names
  • Iterate Through Each Entry in the Directory
  • List directories, subdirectories, and files with Python

Get all File and Subdirectory Names

The first way to list all the file and subdirectory names in a directory is the os.listdir() function. This function returns a list of the names of the entries in the directory as strings, in arbitrary order and without the special entries "." and "..". It is the most basic way to list the subdirectories and file names in a directory.

This is the method we used to get all the filenames when combining files that start with the same word.

import os
 
root = "."
 
for obj in os.listdir(root):
    print(obj)
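
Since os.listdir() only gives back names, a common next step is to sort them and join each one with the directory path before checking whether it is a file or a subdirectory. Here is a minimal sketch of that idea. The helper name list_entries is made up for this example and is not part of the os module.

```python
import os


def list_entries(root="."):
    """Return (files, subdirs) for root as sorted lists of names.

    list_entries is a hypothetical helper for illustration only.
    """
    files, subdirs = [], []
    # os.listdir() returns names only, in arbitrary order, so sort them first.
    for name in sorted(os.listdir(root)):
        # Join with root so the check also works when root is not the
        # current working directory.
        full_path = os.path.join(root, name)
        if os.path.isdir(full_path):
            subdirs.append(name)
        else:
            files.append(name)
    return files, subdirs


if __name__ == "__main__":
    files, subdirs = list_entries(".")
    print("files:", files)
    print("subdirectories:", subdirs)
```

Each os.path.isdir() call here is a separate check against the file system, which is one reason the next approach can be a better fit when we need entry types.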

Iterate Through Each Entry in the Directory

A second way to get every entry in a directory is through the os.scandir() function. This function doesn’t return a list of strings, but rather an iterator of os.DirEntry objects. This method is more efficient than os.listdir() when we need more than the name of the entries, because the entry type and other attributes are exposed on each object when the operating system provides them during the scan.

Each DirEntry object exposes the name and path of the entry as attributes, and its is_file(), is_dir(), and is_symlink() methods tell us whether the entry is a file, a subdirectory, or a symlink. Some of these methods may need to make an operating system call, so they can raise OSError while we work with the results of an os.scandir() call.

import os
 
root = "."
 
for obj in os.scandir(root):
    print(obj)
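
To make the DirEntry attributes more concrete, here is a small sketch that labels each entry by type. The helper name describe_entries and the label strings are assumptions made for this example. It uses os.scandir() as a context manager, which is supported in Python 3.6 and later.

```python
import os


def describe_entries(root="."):
    """Return a sorted list of (name, kind) tuples for the entries in root.

    describe_entries is a hypothetical helper for illustration only.
    """
    described = []
    # The with statement closes the scandir iterator when we are done.
    with os.scandir(root) as entries:
        for entry in entries:
            try:
                if entry.is_symlink():
                    kind = "symlink"
                elif entry.is_dir():
                    kind = "directory"
                elif entry.is_file():
                    kind = "file"
                else:
                    kind = "other"
            except OSError:
                # The type checks may need a system call, which can fail.
                kind = "unknown"
            described.append((entry.name, kind))
    return sorted(described)


if __name__ == "__main__":
    for name, kind in describe_entries("."):
        print(kind, name)
```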

List Directories, Subdirectories, and Files with Python

The os.scandir() and os.listdir() functions are great for getting the entries in one directory, but what if you need the subdirectory entries as well? This is where the third os module function that can iterate through directories comes in: os.walk().

The os.walk() function returns a generator. Each item the generator yields is a tuple of size three. The first entry in the tuple is the path of the directory being visited, the second is the list of subdirectory names in that directory, and the third is the list of file names in it. The walk function doesn’t just look in the starting directory. It also recursively walks through every subdirectory.

We can print all the files in a directory, including the ones nested in subdirectories, using the os.walk() function. All we have to do is loop through the filenames in each list of files and print the current path joined with the filename.

import os
 
root = "."
 
for path, subdirs, files in os.walk(root):
    for filename in files:
        print(os.path.join(path, filename))
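
The same loop can collect subdirectory paths as well as file paths, since each tuple carries both lists. Here is a minimal sketch. The helper name walk_tree is hypothetical, and the results are sorted only so that the output order is predictable.

```python
import os


def walk_tree(root="."):
    """Return (dir_paths, file_paths) found anywhere under root.

    walk_tree is a hypothetical helper for illustration only.
    """
    dir_paths, file_paths = [], []
    for path, subdirs, files in os.walk(root):
        # subdirs and files hold bare names, so join each with the
        # path of the directory currently being visited.
        dir_paths.extend(os.path.join(path, name) for name in subdirs)
        file_paths.extend(os.path.join(path, name) for name in files)
    return sorted(dir_paths), sorted(file_paths)


if __name__ == "__main__":
    dirs, files = walk_tree(".")
    print("subdirectories:", dirs)
    print("files:", files)
```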

Summary of Ways to List Directories, Subdirectories, and Files with Python

In this post we learned about three ways to list the files in a directory. The first two methods list the files and subdirectories in a single directory, and the last method goes into all the subdirectories as well.

The listdir and scandir functions differ in what they return and in the metadata attached to each item: listdir returns a list of name strings, while scandir returns an iterator of DirEntry objects. The walk function returns a generator, which is also an iterable, but it yields tuples instead of objects or strings.

The listdir, scandir, and walk functions are the three os functions covered here for listing directory contents. The listdir function is best used when we just need the names of the entries in a single directory. If we also need the entry types and more metadata, then scandir is a better option. Finally, if we need access to the subdirectory files as well, we should use the walk function.
