How To Save Pandas DataFrame As CSV File

To save Panda's DataFrame in to CSV or Excel file, use following commands...

  1. df.to_csv('data.csv', index=False)
  2. df.to_excel('data.xls', index=False)

In this notebook, we will learn about saving Pandas Dataframe in to a CSV file.

For this excercise we will use dummy data.

In [1]:
import pandas as pd

Let us first create a Python list of dictionaries where each dictionary contains information about a trading stock.

In [2]:
data = [{'tickr':'intc', 'price':45, 'no_of_employees':100000}, {'tickr':'amd', 'price':85, 'no_of_employees':20000}]

Let us first convert above list to Pandas DataFrame using pd.DataFrame method.

In [3]:
df = pd.DataFrame(data)

df is Pandas Dataframe. Let us print it.
To learn more about Pandas and Dataframes, checkout following notebooks...
https://www.nbshare.io/notebooks/pandas/

In [4]:
print(df)
  tickr  price  no_of_employees
0  intc     45           100000
1   amd     85            20000

we can save this data frame using df.to_csv method as shown below. Note the first argument in below command is the file name and second argument 'index=False' will restrict Pandas from inserting row (or index) numbers for each row.

In [5]:
df.to_csv('data.csv', index=False)

Above command shoulde create a 'data.csv' file in our current directory. Let us check that using 'ls' command.

In [6]:
ls -lrt data.csv
-rw-r--r-- 1 root root 56 May 15 00:40 data.csv

yes indeed the file is there. Let us check the contents of this file using Unix 'cat' command.
Note i am running this notebook on Linux machine that is why i am able to run these unix Commands from the Jupyter notebook.

In [7]:
cat data.csv
tickr,price,no_of_employees
intc,45,100000
amd,85,20000

As we see above, the content is comma separated list of values. Instead of comma, we can use any other separator using the "sep" argument.

In [11]:
df.to_csv('data.csv', index=False,sep="|")
In [12]:
cat data.csv
tickr|price|no_of_employees
intc|45|100000
amd|85|20000

Note: There are lot of options which df.to_csv can take. Checkout the complete list below...

df.to_csv(
path_or_buf: 'FilePathOrBuffer[AnyStr] | None' = None,
sep: 'str' = ',',
na_rep: 'str' = '',
float_format: 'str | None' = None,
columns: 'Sequence[Hashable] | None' = None,
header: 'bool_t | list[str]' = True,
index: 'bool_t' = True,
index_label: 'IndexLabel | None' = None,
mode: 'str' = 'w',
encoding: 'str | None' = None,
compression: 'CompressionOptions' = 'infer',
quoting: 'int | None' = None,
quotechar: 'str' = '"',
line_terminator: 'str | None' = None,
chunksize: 'int | None' = None,
date_format: 'str | None' = None,
doublequote: 'bool_t' = True,
escapechar: 'str | None' = None,
decimal: 'str' = '.',
errors: 'str' = 'strict',
storage_options: 'StorageOptions' = None,
) -> 'str | None'