Python Underscore

The underscore (_) is not a simple character in Python. While in most languages it is only used to name functions and variables in the snake-case, in Python it is much more widely used. Most likely, you have already seen constructions for _ in range (10) or __init __ (self).

In this chapter we’ll discuss the following five underscore patterns and naming conventions, and how they affect the behavior of your Python programs:

  1. To store the last value in REPL (Read, Evaluate, Print, Loop).
  2. To ignore unimportant values.
  3. To assign a special value to a function or variable.
  4. As an alias for internationalization and localization functions.
  5. Splitting numbers into digits.

And now let's look at each list item in more detail.

Underscore In Python REPL

The interpreter in the interactive mode stores the last calculated value in a special variable _. This feature first appeared in CPython, but is now supported by all major interpreters.

>>> 10 

10

>>> _ 

10

>>> _ * 3 

30

>>> _ * 20 

600

Python Underscore Variable

Underscoring is also used to ignore values. If you do not want to use some value - assign it to the variable _.

In the following code example, we’re unpacking number from tuple in separated variables.

But for example, we are only interested in the first and last values. However, in order for the unpacking expression to succeed, I need to assign all values contained in the tuple to variables.

# ignore when unpacking
x, _, _, y = (1, 2, 3, 4) # x = 1, y = 4
# ignore multiple values,Python 3.x only
x, *_, y = (1, 2, 3, 4, 5) # x = 1, y = 5
# ignore the index
for _ in range(10):
do_something()
# or some specific value
for _, val in list_of_tuple:
do_something()

Underscoring is most often used in naming. PEP8 describes 4 cases of using underscores in names:

Single Leading Underscore: “_var”

When it comes to variable and method names, the single underscore prefix has a meaning by convention only. It’s a hint to the programmer — it means what the Python community agrees it should mean, but it does not affect the behavior of your programs.

The underscore prefix is meant as a hint to tell another programmer that a variable or method starting with a single underscore is intended for internal use. This convention is defined in PEP 8, the most commonly used Python code style guide.

Take a look at the following example:

class Base:
def __init__(self):
self.var = 'var'
self._var = 'var with leading underscore'
def method(self):
return 'method is called'
def _method(self):
return 'method with leading underscore is called'

What is going to happen if you instantiate Base class and try to access var, _var attributes defined in its __init__ constructor? And what about method and _method ?

Let’s find out:

>>> base = Base()
>>> base.var

'var'

>>> base._var

'var with leading underscore'

>>> base.method()

'method is called'

>>> base._method()

'method with leading underscore is called'

As you can see, the leading single underscore in _var and _method attributes did not prevent us from “reaching into” the class and accessing the value of that variable.

However, leading underscores do impact how names get imported from modules. All names starting with an underscore will be ignored in from module import *

Let’s create a file (module) my_string_formatter.py with following code:

# my_string_formatter.py
def to_lower(s: str) -> str:
return s.capitalize()
def _to_upper(s: str) -> str:
return s.upper()

Now, let's figure out what's going to happen if we will call functions with a wildcard import:

>>> from my_string_formatter.py import *
>>> to_lower('TEST')

'test'

>>> _to_upper('test')

NameError: "name '_to_upper' is not defined"

Python doesn’t import names with a leading underscore (unless the module defines an __all__ list that overrides this behavior by adding __all__ = ['to_lower', '_to_upper'])

By the way, wildcard imports should be avoided as they make it unclear which names are present in the namespace.

Single Trailing Underscore: “var_”

Such names are used to avoid conflicts with keywords in Python by convention. You should not normally use them. This convention is defined and explained in PEP 8.

# avoid conflict with the keyword 'class'
Tkinter.Toplevel(master, class_ = 'ClassName')
# avoid conflict with the standard type 'list'
list_ = List.objects.get(1)

Python Double Underscore

The naming patterns we’ve covered so far receive their meaning from agreed-upon conventions only. With Python class attributes (variables and methods) that start with double underscores, things are a little different.

Python Name Mangling

A double underscore prefix causes the Python interpreter to rewrite the attribute name in order to avoid naming conflicts in subclasses.

How it works? Let’s create class with following attributes:

class Test:
def __init__(self):
self.num1 = 10
self._num2 = 20
self.__num3 = 30

Let’s take a look at the object attributes with build-in dir() function:

>>> test = Test()
>>> dir(test)

['_Test__num3', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_num2', 'num1']

Here we are getting list of object attributes. At the end of list we could noticed our _num2 and num1 attributes:

self.num1 variable appears unmodified as num1 in the attribute list.

self._num2 behaves the same way - it shows as _num2. Like we said before, leading underscore is just a convention.

So what happened to __num3 ?

If you look closely, you would notice _Test__num3 attribute at the start of the list. This is the name mangling that the Python interpreter applies. It works that way to protect the variable from overriding in subclasses.

Name mangling also applies to method names (and frankly, to all names that start with double leading underscore in class context):

class NameManglingMethod:
def __method(self):
return 'name mungling method'
def call_it(self):
return self.__method()
>>> NameManglingMethod.__method()

AttributeError: "NameManglingMethod object has no attribute '__method'"

>>> NameManglingMethod.call_it()

'name mungling method'

Double Leading and Trailing Underscore: "__var__"

Python Functions Starting With Underscore

So-called special (magical) methods. For example, __init__, __len__. Some of them implements syntactic features, some store special data: __file__ indicates the path of the code file, __eq__ is executed when calling the expression a == b.

Of course, the user can create their own methods:

class Base:
def __init__(self):
pass
def __custom__(self): # user custom 'magical' method
pass

Python Variables With Leading and Trailing Underscores

Variables surrounded by a double underscore prefix and postfix are left unscathed by the Python interpreter:

class A:

    def __init__(self):
self.__var__ = 'var'
>>> a = A()
>>> a.__var__

'var'

However, names that have both leading and trailing double underscores are reserved for special use in the language. This rule covers things like __init__ for object constructors, or __call__ to make objects callable etc.

Python Underscore Internationalization and Localization

This is just an agreement on the names of these functions, they do not affect the syntax. This tradition came from C, and the built-in gettext module used to localize. It's used is like in Django, the most popular web framework.

# official docs - https://docs.python.org/3/library/gettext.html

import gettext
gettext.bindtextdomain(
'myapplication',
'path/to/my/language/directory'
)
gettext.textdomain('myapplication')
_ = gettext.gettext

print(_('This is translatable string.'))

Python Underscore to Spit Numbers into Digits

This feature is quite new, it was added only in Python 3.6. You can now use underscores to separate numbers, which improves code overview.

dec_base = 1_000_000
bin_base = 0b_1111_0000
hex_base = 0x_1234_abcd
print(dec_base) # 1000000
print(bin_base) # 240
print(hex_base) # 305441741