Bringing the Power of tidyverse to Pandas!
tidyversetopandas is a Python package designed for users familiar with R's tidyverse who are transitioning to Python. It bridges the syntax gap between R and Python by offering pandas equivalents to popular tidyverse functions. This package is particularly beneficial for data scientists and analysts who seek to leverage pandas' robust capabilities with the familiar syntax of tidyverse.
Installation: see the Installation section below
Documentation: https://tidyversetopandas.readthedocs.io/en/latest/example.html
Source code: https://github.com/UBC-MDS/TidyverseToPandas
Bug reports: https://github.com/UBC-MDS/TidyverseToPandas/issues
Fitting into the Python Ecosystem
While pandas is a powerful tool for data manipulation in Python, it can be challenging for those accustomed to R's tidyverse syntax. tidyversetopandas is unique in its approach to blend these two worlds. In the Python ecosystem, tidyversetopandas fits alongside packages that aim to incorporate tidyverse-like functionality into Python's data manipulation landscape, predominantly with pandas. The goal is to make pandas more accessible to those accustomed to tidyverse syntax.
Two notable packages in this domain are tidypandas and siuba. Both of them represent, similar to tidyversetopandas, efforts to bridge the gap between R's tidyverse approach and Python's pandas library, offering users familiar with R's data manipulation tools a more comfortable transition to Python's data science ecosystem.
Key Functions:
mutate(): Similar to its tidyverse counterpart, this function allows for the creation of new columns or modification of existing ones in a DataFrame.
filter(): Enables row-wise filtering, making it easier to sift through a DataFrame based on specified conditions.
select(): Facilitates the selection of specific columns in a DataFrame, streamlining data manipulation and analysis.
arrange(): Offers sorting capabilities for a DataFrame based on one or multiple columns.
Installation
pip install tidyversetopandas
Usage
Let's try to use tidyversetopandas.
Import package
Import the package into your Python environment after installation:
from tidyversetopandas import tidyversetopandas as ttp
Loading Data
Begin by loading your data into a pandas DataFrame. The examples below assume that you have a DataFrame named df ready for manipulation.
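If you do not have data at hand, a small frame is enough to follow along. The sketch below builds one with plain pandas; the values are made up for illustration, and the column names ("A", "B", "C", "b") are chosen only because the examples in this guide refer to them.

```python
import pandas as pd

# Hypothetical sample data; column names mirror those used in the usage examples.
df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4],
        "B": [7, 5, 3, 1],
        "C": [10, 30, 20, 40],
        "b": [0.5, 1.0, 1.5, 2.0],
    }
)

print(df.shape)
```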
Mutate
Use mutate to create new columns or modify existing ones. We can do this by writing the expression we want as a string.
df = ttp.mutate(df, "b=b*2")
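For comparison, here is a sketch of the same transformation written in plain pandas. It uses only pandas, not tidyversetopandas, and the sample values are hypothetical; whether ttp.mutate is implemented this way internally is not stated here.

```python
import pandas as pd

# Hypothetical sample data with a column named "b".
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# assign() returns a new DataFrame with "b" replaced by b * 2;
# the original df is left unchanged.
doubled = df.assign(b=df["b"] * 2)

print(doubled)
```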
Filter
The filter function is used to subset dataframes based on specified conditions. For instance, to select rows where "A" is greater than 1 and "B" is less than 6:
df = ttp.filter(df, "A > 1 and B < 6")
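Readers coming from pandas may recognise this string condition: DataFrame.query accepts the same kind of expression. The sketch below shows the plain-pandas counterpart on hypothetical data; it illustrates the condition itself and makes no claim about how ttp.filter is implemented.

```python
import pandas as pd

# Hypothetical sample data with columns "A" and "B".
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [7, 5, 3, 8]})

# Keep only rows where both conditions hold.
subset = df.query("A > 1 and B < 6")

print(subset)
```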
Arrange
Sort your dataframe with arrange. You can sort by multiple columns and specify ascending or descending order. For example, to sort by "A" in ascending order and then by "C":
df = ttp.arrange(df, True, "A", "C")
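As a point of reference, a multi-column ascending sort in plain pandas can be sketched as follows. This assumes that the True passed to ttp.arrange is the ascending flag, which the description suggests but does not state outright; the data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data with columns "A" and "C".
df = pd.DataFrame({"A": [2, 1, 2, 1], "C": [30, 20, 10, 40]})

# Sort by "A" first, then break ties using "C", both ascending.
ordered = df.sort_values(by=["A", "C"], ascending=True)

print(ordered)
```

Passing ascending=False to sort_values would reverse the order for both columns.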
Select
To keep only certain columns, use the select function. For example, to keep only the column "A":
df = ttp.select(df, "A")
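The plain-pandas counterpart of column selection is list-based indexing. The sketch below uses hypothetical data and pandas only; it is meant for comparison and does not describe the internals of ttp.select.

```python
import pandas as pd

# Hypothetical sample data with three columns.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Indexing with a list of names returns a DataFrame (not a Series)
# containing only the requested columns.
only_a = df[["A"]]

print(only_a)
```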
Developer Guide
Installation in Development Mode
Clone the repository and navigate to the project root directory.
Create a virtual environment and activate it.
conda env create -f environment.yml
conda activate tidyversetopandas
Make sure poetry is installed. If not, install it by following the official Poetry installation instructions. Once installed, run the following command to install the package in development mode:
poetry install
Testing
To run the tests, use the following command:
pytest tests/
To run tests with coverage, use the following command:
pytest tests/ --cov=tidyversetopandas
To view the coverage report, use the following command:
pytest --cov=tidyversetopandas --cov-report html tests/
This will create an htmlcov directory containing the coverage report in HTML format. Open the index.html file in this directory with a web browser to view the detailed coverage report.
Contributing
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
License
tidyversetopandas was created by Thomas, Sophia, Lily, and Nando. It is licensed under the terms of the MIT license.
Contributors
Credits
tidyversetopandas was created with cookiecutter and the py-pkgs-cookiecutter template.