Malevich
Getting Started
Welcome to Malevich, a platform for building ML-driven prototypes and taking them to production. This page gives a brief overview of the platform's capabilities, which you can use from Python code or from the command-line interface.
To start developing on Malevich, read more about building apps and assembling flows. See the API reference for details on individual functions and classes.
If you wish to contribute to the Malevich package, please refer to the Contributing page.
Installation
Malevich offers various tools for interacting with the platform, including the malevich Python package. This package encompasses Malevich Square, CLI, Metascript, CI, and other minor tools. It is distributed via PyPI, so you can install it with pip:
pip install malevich
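To confirm that the installation succeeded, one option is to query the installed distribution's metadata from Python. The sketch below uses only the standard library. The helper name installed_version is hypothetical and is not part of the Malevich package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # 'malevich' is the distribution name used in the pip command above.
    found = installed_version("malevich")
    print(f"malevich {found}" if found else "malevich is not installed")
```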
Implement Your Idea
Imagine having a brilliant product idea that requires utilizing services like OpenAI, making inferences on a pre-trained model from Hugging Face, or solving a common NLP task with SpaCy. With Malevich, you can turn your idea into a working prototype with just a few steps and an impressively small amount of code.
Let’s make it real! First, you need to install apps:
malevich install spacy openai scrape
Following this, you may notice malevich.yaml and malevich.secret.yaml files appearing in your current directory. Similar to other package systems like pip or npm, Malevich keeps track of installed components to make your environment reproducible.
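If you want to check from a script that both manifest files were created, a minimal sketch along these lines may help. It only tests for the presence of the two file names mentioned above and makes no assumption about their contents. The helper manifest_status is hypothetical, not a Malevich function.

```python
from pathlib import Path
from typing import Dict

# File names taken from the text above.
MANIFEST_FILES = ("malevich.yaml", "malevich.secret.yaml")


def manifest_status(directory: str = ".") -> Dict[str, bool]:
    """Map each manifest file name to whether it exists in the directory."""
    root = Path(directory)
    return {name: (root / name).is_file() for name in MANIFEST_FILES}


if __name__ == "__main__":
    # Report on the current working directory.
    for name, present in manifest_status().items():
        print(f"{name}: {'found' if present else 'missing'}")
```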
Once the apps are installed, you can begin integrating them into a flow. Create a file named flow.py with the following content:
import os

import pandas as pd

from malevich import collection, flow
from malevich.openai import prompt_completion
from malevich.scrape import scrape_web
from malevich.spacy import extract_named_entities

prompt = """
You are a professional journalist. You've received
news containing many references to different people.
Your task is to understand the roles of these {entities}
and write a brief summary about them. The brief should
include the following information:
- Who is the person?
- What is their role in the news?
- What are the main events they are involved in?
Only include individuals for whom there is sufficient
information in the news. Otherwise, omit their names
entirely from the brief.
"""


@flow()
def write_brief():
    # Scrape some news from the OpenAI blog.
    links = collection(
        'News Links',
        df=pd.DataFrame(
            [
                'https://openai.com/blog/sam-altman-returns-as-ceo-openai-has-a-new-initial-board',
            ], columns=['link']
        )
    )

    # The scraper app will retrieve information from websites specified by XPath,
    # a query language that allows extracting information from markup documents.
    text = scrape_web(
        links,
        config={
            'spider': 'xpath',
            'min_length': 100,
            'max_results': 25,
            'links_are_independent': True,
            'max_depth': 1,
            'spider_cfg': {
                'components': [{
                    'key': 'news-text',
                    # Specify XPath query.
                    'xpath': "//div[@id='content']//text()"
                }],
                'output_type': 'text',
                'include_keys': False
            }
        })

    # Extract names of entities.
    entities = extract_named_entities(
        text, config={
            'output_format': 'list',
            'filter_labels': ['PERSON'],
        }
    )

    # Write a brief about the news using OpenAI API
    # to generate text based on our prompt and extracted names.
    return text, prompt_completion(
        entities,
        openai_api_key=os.getenv('OPENAI_API_KEY'),
        user_prompt=prompt
    )


if __name__ == '__main__':
    from malevich import CoreInterpreter

    # Create a task for writing a brief.
    pipeline = write_brief()

    # Before running the task, interpret it to make
    # the platform aware of dependencies and execution flow.
    pipeline.interpret(
        CoreInterpreter(
            core_auth=('example', 'Welcome to Malevich!')
        )
    )

    # Prepare task.
    pipeline.prepare()

    # Execute the task.
    pipeline.run()

    # Save results.
    text, brief = pipeline.results()
    text.get_df().to_csv('text.csv')
    brief.get_df().to_csv('brief.csv')
As you can see, solving the task of extracting news, identifying people’s names, and writing a brief is simply a matter of configuring three apps correctly. Once you run the pipeline, you will find text.csv and brief.csv files in your current directory and can review the results.
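To get a feel for what the XPath query in the flow selects, the following sketch runs a similar selection on a small, made-up document using Python's standard library. This is an illustration only: xml.etree.ElementTree supports a limited XPath subset and has no text() step, so the sketch locates the div by its id and then collects its text nodes with itertext(). The sample markup and the helper content_text are assumptions for this example and say nothing about how the scraper app is implemented.

```python
import xml.etree.ElementTree as ET
from typing import List

# Made-up, well-formed markup standing in for a scraped page.
SAMPLE = """<html>
  <body>
    <div id="nav">Home</div>
    <div id="content">
      <p>First paragraph.</p>
      <p>Second <b>paragraph</b>.</p>
    </div>
  </body>
</html>"""


def content_text(markup: str, element_id: str = "content") -> List[str]:
    """Collect non-empty text nodes under every div with the given id."""
    root = ET.fromstring(markup)
    chunks: List[str] = []
    # Rough equivalent of //div[@id='content']//text() in full XPath.
    for div in root.findall(f".//div[@id='{element_id}']"):
        for piece in div.itertext():
            piece = piece.strip()
            if piece:
                chunks.append(piece)
    return chunks


if __name__ == "__main__":
    for chunk in content_text(SAMPLE):
        print(chunk)
```

Text from the navigation div is skipped because only the element whose id is content matches the query.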
Run the flow with this command (ensure your OPENAI_API_KEY environment variable is set):
python flow.py
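Since the flow reads the key with os.getenv, which returns None when the variable is missing, it can be convenient to fail early with a clear message instead. The sketch below shows one way to do that. The helper require_env is hypothetical and not part of Malevich; in flow.py you could call it before building the pipeline and let the error propagate.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable or raise if it is unset or blank."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


if __name__ == "__main__":
    # Report the status instead of crashing when run as a standalone check.
    try:
        require_env("OPENAI_API_KEY")
        print("OPENAI_API_KEY is set")
    except RuntimeError as err:
        print(err)
```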
Make Your Own Apps
We are continually expanding our list of available apps. If you find something missing that you need, we provide all the tools to create your own apps and optionally share them with the community. See Building Apps for more details.