Advice for Python Beginners

In addition to my post about resources for learning Python, here are a few tips for Python beginners:

Python Standards (aka PEP 8)

Python evolves in the open through Python Enhancement Proposals (PEPs). PEP 8 is the style guide for Python code. Since most code is read, maintained, and modified by people, it's important to make it as readable and maintainable as possible. Adhering to the official Python style guide helps with that.

The PyCharm IDE will actually check your code against PEP 8 and give you warnings about violations. PyCharm uses a linter which can find many code smells.

Naming Conventions

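Part of PEP 8 deals with naming conventions. Most names in Python code should be snake case (lowercase words separated by underscores), not camel case. Class names are the exception: they use title case or proper case, which PEP 8 calls CapWords. Constants are upper case. Here are some examples: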

Kind            Example                             Case
Constant        PIPE = '|'                          Upper
Class name      class ViewingMapper:                Title or Proper
Module name     import viewing_mapper               Snake
Function name   def normalize_data():               Snake
Argument name   def run(start_date, end_date):      Snake
Variable name   viewing_mapper = ViewingMapper()    Snake
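
Put together, the conventions in the table might look like this inside a single module. This is only an illustrative sketch: the method `map_line`, the body of `normalize_data`, and what `run` returns are invented for the example and are not taken from any real codebase.

```python
from datetime import date

# Constant: upper case.
PIPE = '|'


class ViewingMapper:
    """Class name: title case (CapWords)."""

    def map_line(self, line):
        # Method and argument names: snake case.
        return line.split(PIPE)


def normalize_data(raw_values):
    """Function name: snake case."""
    return [value.strip().lower() for value in raw_values]


def run(start_date, end_date):
    """Argument names: snake case."""
    # Variable name: snake case, even when it holds a class instance.
    viewing_mapper = ViewingMapper()
    fields = viewing_mapper.map_line(' Drama | 42 ')
    day_count = (end_date - start_date).days
    return normalize_data(fields), day_count
```

The file holding this code would itself be named in snake case, for example `viewing_mapper.py`, so that `import viewing_mapper` works as shown in the table.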

Private variables

Python doesn't support truly private scope: nothing in the language prevents outside code from accessing an attribute. However, the convention is to prefix names meant for internal use with a single underscore. For example:

self._s3 = boto3.resource('s3')
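
Here is a minimal sketch of that convention in a complete class. `ReportStore` and its methods are hypothetical, and a plain dictionary stands in for a real S3 resource so the example runs without any dependencies. Note that the underscore is only a signal to other developers; the last line shows that the attribute is still reachable.

```python
class ReportStore:
    """Hypothetical class, used only to illustrate the naming convention."""

    def __init__(self):
        # Leading underscore: an internal detail, not part of the public interface.
        self._reports = {}

    def save(self, name, body):
        self._reports[name] = body

    def load(self, name):
        return self._reports[name]


store = ReportStore()
store.save('daily', 'ok')

# Preferred: go through the public methods.
print(store.load('daily'))

# Still possible, but discouraged by convention: nothing enforces privacy.
print(store._reports)
```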

Best Practices

Class Usage

I noticed the code you attached had wrapped the functionality in a class, Bucket. In this particular code, I would probably skip creating a Bucket class, since the SDK is sufficiently effective on its own. Two of several reasons for creating a class are to hide complexity and to increase usability. For example, if the SDK had required a lot of setup or steps to use, a class would've been a good way to hide that complexity and make it easier to use. Another reason for creating a class might be to abstract the implementation details so that they could be changed in the future, or so that new implementations could be created (e.g. through inheritance or composition). However, here we only plan on dealing with Amazon's S3, and the SDK is already pretty easy to use.

Instead of a Bucket class, you may want to consider a class for a specific task. This is sometimes known as a Command pattern, or a Task or Service class. For example, perhaps an S3Uploader class? Ultimately though, it's up to you, the author, to decide. I always try to make sure my code is easily readable, testable, and maintainable. Prefer specifically named things over vaguely named ones. What does Bucket mean to someone not familiar with AWS? Would S3Bucket be a better name? Or even something task / action oriented, like S3Uploader?
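
As a rough sketch of what such a task-oriented class could look like, here is a hypothetical `S3Uploader`. The class name comes from the suggestion above, but its methods, the `key_prefix` parameter, and the decision to pass the client in are my own assumptions. It expects any object with an `upload_file(Filename, Bucket, Key)` method, which is what a boto3 S3 client (`boto3.client('s3')`) provides. The `RecordingClient` stand-in and the bucket name are placeholders so the example runs without AWS credentials.

```python
import os


class S3Uploader:
    """Hypothetical task-oriented class: it does one thing, upload files."""

    def __init__(self, s3_client, bucket_name, key_prefix=''):
        # The client is passed in rather than created here, so a test can
        # substitute a fake. In real use this would be boto3.client('s3').
        self._s3_client = s3_client
        self._bucket_name = bucket_name
        self._key_prefix = key_prefix

    def upload(self, local_path):
        """Upload one local file and return the object key that was used."""
        key = self._key_prefix + os.path.basename(local_path)
        self._s3_client.upload_file(local_path, self._bucket_name, key)
        return key


class RecordingClient:
    """Stand-in for a real S3 client; it only records what it was asked to do."""

    def __init__(self):
        self.calls = []

    def upload_file(self, Filename, Bucket, Key):
        self.calls.append((Filename, Bucket, Key))


fake_client = RecordingClient()
uploader = S3Uploader(fake_client, 'example-bucket', key_prefix='reports/')
uploaded_key = uploader.upload('/tmp/daily.csv')
print(uploaded_key)
```

Passing the client in is one design choice among several. It keeps the class easy to test, at the cost of making the caller create the client.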

Additional Design Patterns

Creating a class or object for one specific task follows a well-known design principle, the single responsibility principle: a class should be responsible for only one part of the system. This keeps the codebase clean and understandable, and it makes the code easier to test.

Here is a list of other terms and patterns I would encourage looking into:

Everything doesn't have to be Object-Oriented

The above are all helpful when considering Object-Oriented design, though sometimes OO design may be overkill or unnecessary. Resist the urge to treat every programming task as a nail for an Object-Oriented hammer, so to speak. Overreliance on OOD can sometimes lead to coupled / brittle code with too much boilerplate that only represents relationships. It's perfectly acceptable to write top-down, module-based solutions. Not everything has to be Class / Object driven. Treat OO as an available tool in your tool-belt; use it when it's the best technique for the task at hand.
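
For instance, a small data-cleaning job can be a handful of plain functions in a module, with no class at all. The function names and the sample data below are made up for illustration.

```python
def parse_line(line, delimiter='|'):
    """Split one delimited line into stripped fields."""
    return [field.strip() for field in line.split(delimiter)]


def is_complete(fields):
    """A record is usable only if none of its fields is empty."""
    return all(fields)


def load_records(lines):
    """Top-down entry point: parse every line, keep the complete records."""
    records = [parse_line(line) for line in lines]
    return [record for record in records if is_complete(record)]


if __name__ == '__main__':
    sample_lines = ['drama | 42', 'comedy | ', 'news | 7']
    print(load_records(sample_lines))
```

Each function can be tested on its own, and nothing here would become clearer by being wrapped in a class.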

Learn Functional Programming

Learn functional programming too. It should be another tool in your tool-belt. Python is not a purely functional language, but you can make use of functional concepts in it (lambda, map / reduce, etc.). Functional programming languages are usually better suited for parallelism / concurrency, largely because they favor immutable data and functions without side effects.
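
A short sketch of those concepts using only the standard library. The `view_counts` data is invented for the example.

```python
from functools import reduce

view_counts = [120, 0, 45, 300]

# map + lambda: apply a function to every item; the original list is untouched.
doubled = list(map(lambda count: count * 2, view_counts))

# filter: keep only the items for which the function returns True.
non_zero = list(filter(lambda count: count > 0, view_counts))

# reduce: fold a sequence down to a single value.
total = reduce(lambda running, count: running + count, view_counts, 0)

print(doubled)   # [240, 0, 90, 600]
print(non_zero)  # [120, 45, 300]
print(total)     # 465
```

In everyday Python, a list comprehension or the built-in `sum` is often considered more readable than `map`, `filter`, or `reduce` for cases this simple. The point here is only to show the functional building blocks.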
