# PyMurTree

[![GitHub](https://img.shields.io/github/license/MurTree/pymurtree?)](https://github.com/MurTree/pymurtree/blob/master/LICENSE)
![GitHub Workflow Status (with event)](https://img.shields.io/github/actions/workflow/status/MurTree/pymurtree/wheels.yml)
[![codecov](https://codecov.io/gh/MurTree/pymurtree/branch/develop/graph/badge.svg?token=2TKHE7IN49)](https://codecov.io/gh/MurTree/pymurtree)
![GitHub repo size](https://img.shields.io/github/repo-size/MurTree/pymurtree)

Warning:

> This repository is under active development and is currently in a pre-alpha state.
> The current version only supports Linux.

PyMurTree is a Python wrapper for the [MurTree project](https://github.com/MurTree/murtree). The MurTree algorithm constructs optimal classification trees that minimize the misclassification score on a given dataset while respecting constraints on the depth and the number of feature nodes. The sparse objective, which penalizes each node added to the tree, is also supported.

---

## **Citation**

This package is based on the methods and algorithms described in:

> **"MurTree: Optimal Decision Trees via Dynamic Programming and Search"**
> by Emir Demirović, Anna Lukina, Emmanuel Hebrard, Jeffrey Chan, James Bailey, Christopher Leckie, Kotagiri Ramamohanarao, and Peter J. Stuckey
> Journal of Machine Learning Research (JMLR), 2022.
> [Available online](https://jmlr.org/papers/v23/20-520.html)

---

## Installation

Before installing pymurtree, make sure the following software is available on your system: Python version 3.7 or higher, and pip.

### Install from source using `pip`

```bash
git clone https://github.com/MurTree/pymurtree.git
cd pymurtree

# Optional: build pymurtree within a virtual environment
# python3 --version # check your python version
# sudo apt install python3-venv
# python3 -m venv env
# . env/bin/activate

# Install
pip install .

# To install the dev version:
# pip install ".[dev]"
```

### Building and running the tests

To build and run the tests, you will need the following software: pytest, a C++ compiler, and CMake version 3.14 or higher.

```bash
# Run the Python tests
pytest

# Run the C++ tests
cd tests/cpptests
mkdir build
cd build
cmake ..
make
ctest
```

## Usage

The full [API specification](https://murtree.github.io/pymurtree/api_reference.html) is available in the repo's documentation.

### Example

After installing pymurtree, you can use it in your Python code by importing the package. Here is an example of how to build a decision tree classifier from a training dataset, make predictions, and export the tree for visualization with [graphviz](https://graphviz.org/):

```python
import pymurtree
import numpy

# Create training data
x = numpy.array([[0, 1, 0, 1], [1, 0, 0, 1], [1, 1, 0, 0]]) # features
y = numpy.array([5, 5, 4]) # labels

# Build tree classifier
model = pymurtree.OptimalDecisionTreeClassifier()
model.fit(x, y, max_depth=4, max_num_nodes=5, time=400)

# Predict labels for a new set of features
ft = numpy.array([[1, 0, 0, 1], [0, 0, 1, 1], [1, 0, 1, 0]])
labels = model.predict(ft)

# Visualize tree
model.export_text()

# Export tree in DOT format for visualization with graphviz
model.export_dot()
```

### Datasets

A collection of datasets compatible with pymurtree is available at [https://github.com/MurTree/murtree-data](https://github.com/MurTree/murtree-data).

## Contributing

There are different ways in which you can contribute to pymurtree:

- Try the package and let us know if it's useful for your work.
- If it's useful, please star the repo.
- Report a bug or request a feature by opening an issue.
- Contribute to the codebase by opening a pull request.
- Currently pymurtree works only on Linux; any help picking up open issues and fixing them for other platforms is welcome.

## License

[MIT LICENSE](LICENSE)
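
## Appendix: preparing binary features (sketch)

The usage example above trains on a 0/1 feature matrix, and the MurTree paper assumes binary features. If your raw data is numeric, one possible preprocessing step is to threshold each column by hand. The following is a minimal sketch under that assumption: the helper `binarize_by_thresholds`, the sample values, and the cut points are all hypothetical and are not part of pymurtree.

```python
from typing import List, Sequence


def binarize_by_thresholds(
    rows: Sequence[Sequence[float]],
    thresholds: Sequence[Sequence[float]],
) -> List[List[int]]:
    """Turn numeric columns into 0/1 columns (hypothetical helper).

    For column j and each threshold t in thresholds[j], the output gets one
    binary column that is 1 when rows[i][j] >= t and 0 otherwise.
    """
    binary_rows = []
    for row in rows:
        if len(row) != len(thresholds):
            raise ValueError("each row needs one value per threshold list")
        bits = []
        for value, column_thresholds in zip(row, thresholds):
            bits.extend(int(value >= t) for t in column_thresholds)
        binary_rows.append(bits)
    return binary_rows


# Hypothetical raw data: two numeric features per sample.
raw = [[23.0, 1200.0], [47.0, 300.0], [35.0, 800.0]]
# Hypothetical cut points, chosen by hand for illustration only.
cuts = [[30.0, 40.0], [500.0]]

binary = binarize_by_thresholds(raw, cuts)
print(binary)
```

The resulting list of lists can be wrapped with `numpy.array(binary)` to get a matrix shaped like `x` in the usage example. How to choose good cut points depends on the dataset and is outside the scope of this sketch.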