Scenario
“Your company needs to learn about the files located on various machines. You have been asked to build a script that extracts information such as the name and size about the files in the current working directory and stores it in a list of dictionaries.” -LUIT, Python2
Automation scripting is a game-changer and the LUIT-Python2 GitHub repository provides a robust solution. This repository offers a Python-based tool that simplifies tasks like network provisioning, application deployment, and system configuration.
Why Automate Infrastructure Management?
Efficiency: By automating repetitive tasks, you save time and reduce the risk of human error.
Consistency: Each infrastructure deployment is identical, ensuring that “it works on my machine” never becomes an issue.
Scalability: Automated scripts can scale effortlessly, provisioning multiple environments with just a few commands.
Maintainability: Once an automated script is written, it can be reused, updated, and version-controlled.
Getting Started
Before diving into the repository, make sure you have Python 3.6 or later installed on your machine (the script uses f-strings, which are not available in Python 2.7). You will also need a basic understanding of Linux-based systems, as many of the scripts are tailored for that environment.
Clone the Repository: Start by cloning the LUIT-Python2 repository from GitHub:
git clone https://github.com/Judewakim/LUIT-Python2.git
Install Dependencies: The repository uses a set of Python packages to handle various tasks. You can install them using pip:
cd LUIT-Python2
pip install -r requirements.txt
Writing Python
To break down the essential components of the data_extraction.py file, let's start with the imports. The script needs the os module to walk directories and read file metadata, and the time module to format timestamps.
import os
import time
Next, we begin defining the first function of the code. This function will collect all the information that we will need. Remember, the goal of this code is to “build a script that extracts information such as the name and size about the files in the current working directory” so we will collect all the information about the working directory (or whatever directory is specified) and later we will present this information similar to the ls -al command in the Linux command line. That function will look like this:
def get_files_info(path='.'):  # Function that collects file details, defaulting to current directory
    files_info = []  # List to store file details
    for root, _, files in os.walk(path):  # Recursively traverse directories and files
        for filename in files:
            file_path = os.path.join(root, filename)  # Construct the full file path
            file_stat = os.stat(file_path)  # Get file details/statistics
            files_info.append({  # Store file details in a dictionary
                'name': filename,  # File name
                'path': file_path,  # Full file path
                'size_bytes': file_stat.st_size,  # File size in bytes
                'last_modified': time.ctime(file_stat.st_mtime),  # Last modified time
                'permissions': oct(file_stat.st_mode)[-3:],  # File permissions in octal format (last 3 digits)
            })
    return files_info  # Return the list of file details
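Because os.walk descends into every subdirectory, the function above reports more than the working directory itself. If only the top-level files are wanted, as the scenario describes, a non-recursive variant is one option. The following is a minimal sketch, not part of the repository: the name get_files_info_flat is hypothetical, and skipping unreadable entries is an assumption about the desired behavior.

```python
import os
import time


def get_files_info_flat(path="."):
    """Collect details for regular files directly inside `path` (no recursion).

    Hypothetical helper for illustration; it returns dictionaries with the
    same keys as get_files_info.
    """
    files_info = []
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                # Skip subdirectories, symlinks, and other non-regular entries
                if not entry.is_file(follow_symlinks=False):
                    continue
                file_stat = entry.stat(follow_symlinks=False)
            except OSError:
                # Assumption: an entry that vanished or cannot be read is skipped
                continue
            files_info.append({
                "name": entry.name,
                "path": entry.path,
                "size_bytes": file_stat.st_size,
                "last_modified": time.ctime(file_stat.st_mtime),
                "permissions": oct(file_stat.st_mode)[-3:],
            })
    return files_info
```

Since the dictionaries have the same shape, the result can be passed to the same display function as the recursive version.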
Now, we have a function that collects the file name, file path, file size, last modification time, and file permissions of each file in the path. The next thing to do is create another function that will display all this collected data in the way we want it. That second function will look like this:
def print_ll_view(files_info):  # Function to print file details in 'll' view format
    for file in files_info:  # Loop through the list of file dictionaries
        print(f"{file['permissions']} {file['size_bytes']} {file['last_modified']} {file['path']}")  # Print file details
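The output above shows permissions as three octal digits (for example 644), whereas ls -l prints a symbolic string such as -rw-r--r--. If a closer match to ls is wanted, the standard library's stat.filemode can do the conversion. This is a sketch under that assumption; format_ll_line is a hypothetical helper and the column width is an arbitrary choice.

```python
import os
import stat
import time


def format_ll_line(file_path):
    """Build one ls-style line for `file_path` (hypothetical helper)."""
    st = os.stat(file_path)
    # stat.filemode turns the numeric mode into a string like '-rw-r--r--'
    permissions = stat.filemode(st.st_mode)
    # Right-align the size in a 10-character column (arbitrary width)
    return f"{permissions} {st.st_size:>10} {time.ctime(st.st_mtime)} {file_path}"


# A regular file with mode 644 and a directory with mode 755
print(stat.filemode(0o100644))  # -rw-r--r--
print(stat.filemode(0o040755))  # drwxr-xr-x
```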
Lastly, we will call these functions and add a bit of user interaction to make the program more convenient to use. This last bit of code runs only when the script is executed directly; it is skipped when the file is imported from another module. This is what that looks like:
if __name__ == "__main__":  # Run the script only if executed directly
    path = input("Enter the directory path (press Enter to use current directory): ") or "."  # Prompt user for path, default to current directory
    files_data = get_files_info(path)  # Get file details for the specified path
    print("\nLinux CLI 'll' View:")  # Print header
    print_ll_view(files_data)  # Display file details
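The interactive prompt works well for a person at a terminal. When the script has to run unattended on several machines, taking the directory as a command-line argument may be more practical. The sketch below uses argparse from the standard library; the parse_args helper and its argument name are hypothetical and not part of the repository.

```python
import argparse


def parse_args(argv=None):
    """Parse an optional directory argument (hypothetical helper).

    With argv=None, argparse reads the real command line; passing a list
    makes the function easy to try out or test.
    """
    parser = argparse.ArgumentParser(
        description="List file details for a directory."
    )
    parser.add_argument(
        "path",
        nargs="?",      # The argument is optional
        default=".",    # Fall back to the current directory
        help="directory to scan (default: current directory)",
    )
    return parser.parse_args(argv)
```

Inside the `if __name__ == "__main__":` block, `parse_args().path` could then take the place of the `input()` call.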
Running the Program
The program is now ready to use. Navigate to the location where the program is stored. If you cloned it from my GitHub repository, the location should be LUIT-Python2. Once there, run the program with the command python .\data_extraction.py (or python ./data_extraction.py on Linux and macOS).
At this point, you have a Python program that lists the files under whatever location you specify and displays them in a format similar to the Linux ls command for easy readability. The program can be rerun with any file path you need.
You can modify this program to fit your company’s needs by cloning it locally or forking it on GitHub.
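As one example of such a modification, the list of dictionaries returned by get_files_info can be sorted or exported without touching the collection code. The following is a minimal sketch; largest_files, to_json, and the sample records are hypothetical and only mirror the dictionary keys used above.

```python
import json


def largest_files(files_info, count=5):
    """Return up to `count` entries, largest file first."""
    return sorted(files_info, key=lambda f: f["size_bytes"], reverse=True)[:count]


def to_json(files_info):
    """Serialize the list of dictionaries, e.g. to send to a central report."""
    return json.dumps(files_info, indent=2)


# Made-up sample records in the same shape that get_files_info returns
sample = [
    {"name": "a.txt", "path": "./a.txt", "size_bytes": 120,
     "last_modified": "Mon Jan  1 10:00:00 2024", "permissions": "644"},
    {"name": "b.log", "path": "./logs/b.log", "size_bytes": 4096,
     "last_modified": "Tue Jan  2 11:30:00 2024", "permissions": "600"},
]

print(largest_files(sample, count=1)[0]["name"])  # b.log
print(to_json(sample))
```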
This entire project is available on GitHub
Originally published on Medium
Find me on LinkedIn
Original article: Master Python Automation: Extract and Display File Info Like a Pro