{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d35ac8ce-0f92-46f5-9ba4-a46970f0ce19",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Cognee - Get Started"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "bd981778-0c84-4542-8e6f-1a7712184873",
|
||
"metadata": {
|
||
"editable": true,
|
||
"slideshow": {
|
||
"slide_type": ""
|
||
},
|
||
"tags": []
|
||
},
|
||
"source": [
|
||
"## Let's talk about the problem first\n",
|
||
"\n",
|
||
"### Large Language Models (LLMs) have become powerful tools for generating text and answering questions, but they still have several limitations and challenges. Below is an overview of some of the biggest problems with the results they produce:\n",
|
||
"\n",
|
||
"### 1. Hallucinations and Misinformation\n",
|
||
"- Hallucinations: LLMs sometimes produce outputs that are factually incorrect or entirely fabricated. This phenomenon is known as \"hallucination.\" Even if an LLM seems confident, the information it provides might not be reliable.\n",
|
||
"- Misinformation: Misinformation can be subtle or glaring, ranging from minor inaccuracies to entirely fictitious events, sources, or data.\n",
|
||
"\n",
|
||
"### 2. Lack of Contextual Understanding\n",
|
||
"- LLMs can recognize and replicate patterns in language but don’t have true comprehension. This can lead to responses that are coherent but miss nuanced context or deeper meaning.\n",
|
||
"- They can misinterpret multi-turn conversations, leading to confusion in maintaining context over a long dialogue.\n",
|
||
"\n",
|
||
"### 3. Inconsistent Reliability\n",
|
||
"- Depending on the prompt, LLMs might produce inconsistent responses to similar questions or tasks. For example, the same query might result in conflicting answers when asked in slightly different ways.\n",
|
||
"- This inconsistency can undermine trust in the model's outputs, especially in professional or academic settings.\n",
|
||
"\n",
|
||
"### 4. Inability to Access Real-Time Information\n",
|
||
"- Most LLMs are trained on data up to a specific point and cannot access or generate information on current events or emerging trends unless updated. This can make them unsuitable for inquiries requiring up-to-date information.\n",
|
||
"- Real-time browsing capabilities can help, but they are not universally available.\n",
|
||
"\n",
|
||
"### 5. Lack of Personalization and Adaptability\n",
|
||
"- LLMs do not naturally adapt to individual preferences or learning styles unless explicitly programmed to do so. This limits their usefulness in providing personalized recommendations or support.\n",
|
||
"\n",
|
||
"### 6. Difficulty with Highly Technical or Niche Domains\n",
|
||
"- LLMs may struggle with highly specialized or technical topics where domain-specific knowledge is required.\n",
|
||
"- They can produce technically plausible but inaccurate or incomplete information, which can be misleading in areas like law, medicine, or scientific research.\n",
|
||
"\n",
|
||
"### 7. Ambiguity in Response Generation\n",
|
||
"- LLMs might not always specify their level of certainty, making it hard to gauge when they are speculating or providing less confident answers.\n",
"- They often lack a reliable mechanism to say “I don’t know,” which can lead to responses that are less useful or potentially misleading."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d8e606b1-94d3-43ce-bb4b-dbadff7f4ca6",
|
||
"metadata": {},
|
||
"source": [
"## The next solution was RAG\n",
|
||
"\n",
"#### RAG (Retrieval-Augmented Generation) systems connect to a vector store and search for similar data so they can enrich the LLM's response."
|
||
]
|
||
},
|
||
{
|
||
"attachments": {
|
||
"df72c97a-cb3b-4e3c-bd68-d7bc986353c6.png": {
|
||
"image/png": ""
|
||
}
|
||
},
|
||
"cell_type": "markdown",
|
||
"id": "23e74f22-f43c-4f03-afe0-b423cbaa412a",
|
||
"metadata": {},
|
||
"source": [
|
||
"\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "1bf1fa3631dc03ed",
|
||
"metadata": {},
|
||
"source": [
"#### The problem lies in the nature of the search. If you just match some keywords and return one or many documents from the vector store, you have no good way to organise and prioritise those documents.\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "5029110f",
|
||
"metadata": {},
|
||
"source": [
|
||
""
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "390e0d0805096f80",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Semantic similarity search is not magic\n",
"#### The most similar result isn't necessarily the most relevant one.\n",
"#### If you search for documents that express the sentiment \"I like apples.\", some of the closest results will be documents that express the sentiment \"I don't like apples.\"\n",
|
||
"#### Wouldn't it be nice to have a semantic model LLMs could use?\n"
|
||
]
|
||
},
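The apples example can be reproduced with a toy similarity measure. The sketch below is an illustration only: it uses bag-of-words cosine similarity from the standard library as a stand-in for a real embedding model, and the second candidate sentence is made up for the comparison. The point it shows is that the negated sentence shares almost every word with the query, so it scores highest even though it says the opposite.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lower-case the text and count word occurrences (apostrophes are kept)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


query = "I like apples."
candidates = [
    "I don't like apples.",            # opposite sentiment, nearly identical wording
    "Apples are my favourite fruit.",  # same sentiment, different wording (made-up example)
]

query_vector = bag_of_words(query)
scores = {text: cosine_similarity(query_vector, bag_of_words(text)) for text in candidates}

# Rank candidates from most to least similar to the query.
ranked = sorted(candidates, key=scores.get, reverse=True)
for text in ranked:
    print(f"{scores[text]:.2f}  {text}")
```

Real embedding models capture more meaning than word counts do, but negation can still land close to the original statement, which is the motivation for adding explicit structure on top of similarity search.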
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "b900f830-8e9e-4272-b198-594606da4457",
|
||
"metadata": {},
|
||
"source": [
|
||
"# That is where Cognee comes in"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d3ae099a-1bbb-4f13-9bcb-c0f778d50e91",
|
||
"metadata": {},
|
||
"source": [
"#### Cognee helps developers make their Retrieval-Augmented Generation (RAG) workflows more predictable and manageable through graph architectures, vector stores, and auto-optimizing pipelines. Representing information as a graph is a clear way to grasp the content of your documents. Crucially, graphs allow systematic navigation and extraction of data from documents based on their hierarchy.\n",
|
||
"\n",
|
||
"#### Cognee lets you create tasks and contextual pipelines of tasks that enable composable GraphRAG, where you have full control of all the elements of the pipeline from ingestion until graph creation. "
|
||
]
|
||
},
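To make the idea of composing tasks into a pipeline concrete, here is a minimal sketch using only `asyncio`. It is not Cognee's implementation: `TaskFn`, `split_into_chunks`, `tag_with_length`, and `run_pipeline` are hypothetical names chosen for this illustration. Each task is an async function that receives the previous task's output, and the pipeline yields every intermediate result so the caller can inspect each stage.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable

# Hypothetical task type: an async function that maps one batch to the next.
TaskFn = Callable[[list[Any]], Awaitable[list[Any]]]


async def split_into_chunks(documents: list[str]) -> list[str]:
    """Split each document into sentence-like chunks on full stops."""
    return [part.strip() for doc in documents for part in doc.split(".") if part.strip()]


async def tag_with_length(chunks: list[str]) -> list[dict]:
    """Attach a word count to every chunk."""
    return [{"text": chunk, "word_count": len(chunk.split())} for chunk in chunks]


async def run_pipeline(tasks: list[TaskFn], data: list[Any]) -> AsyncIterator[list[Any]]:
    """Run the tasks in order, yielding the output of each stage."""
    current = data
    for task in tasks:
        current = await task(current)
        yield current


async def main() -> list[Any]:
    collected = []
    documents = ["Graphs link facts. Vectors find neighbours."]
    async for stage_output in run_pipeline([split_into_chunks, tag_with_length], documents):
        collected.append(stage_output)
    return collected


# In a notebook, where an event loop is already running, use `await main()` instead.
results = asyncio.run(main())
print(results[-1])
```

Because every stage is just a function in a list, stages can be reordered, replaced, or removed without touching the others, which is what makes this style of pipeline composable.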
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "785383b0-87b5-4a0a-be3f-e809aa284e30",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Core Concepts"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3540ce30-2b22-4ece-8516-8d5ff2a405fe",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Concept 1: Data Pipelines"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7e47bae4-d27d-4430-a134-e1b381378f5c",
|
||
"metadata": {},
|
||
"source": [
"### Most of the data we provide to a system can be categorized as unstructured, semi-structured, or structured. Rows from a database are structured data, JSON documents are semi-structured, and logs that we feed into the system can be considered unstructured. To organize and process this data, we need custom loaders for each data type, which help us unify and organize it properly."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "2f9c9376-8c68-4397-9081-d260cddcbd25",
|
||
"metadata": {},
|
||
"source": [
|
||
""
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7c87c5cf",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### In the example above, we have a pipeline in which data has been imported from various sources, normalized, and stored in a database. "
|
||
]
|
||
},
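The import, normalize, and store steps described here can be sketched with the standard library. This is an assumption-laden illustration, not Cognee's loaders: the `Record` shape, the three loader functions, and the sample inputs are all hypothetical. Each loader handles one category of data and returns the same record type, so everything downstream can treat the data uniformly.

```python
import csv
import io
import json
import sqlite3
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical unified shape that every loader produces."""
    source_type: str
    text: str


def load_structured(csv_text: str) -> list[Record]:
    """Database-style rows: one record per row, columns joined as 'key=value'."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [Record("structured", ", ".join(f"{k}={v}" for k, v in row.items())) for row in rows]


def load_semi_structured(json_text: str) -> list[Record]:
    """JSON: one record per object in the top-level array."""
    return [Record("semi-structured", json.dumps(item, sort_keys=True)) for item in json.loads(json_text)]


def load_unstructured(log_text: str) -> list[Record]:
    """Free-form logs: one record per non-empty line."""
    return [Record("unstructured", line.strip()) for line in log_text.splitlines() if line.strip()]


# Import from the three kinds of sources and normalize into one list.
records = (
    load_structured("id,name\n1,Ada\n2,Grace\n")
    + load_semi_structured('[{"name": "Ada", "skills": ["python"]}]')
    + load_unstructured("2024-12-24 INFO pipeline started\n\n2024-12-24 INFO pipeline finished\n")
)

# Store the normalized records in an in-memory SQLite database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE records (source_type TEXT, text TEXT)")
connection.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(record.source_type, record.text) for record in records],
)
stored_count = connection.execute("SELECT COUNT(*) FROM records").fetchone()[0]
print(stored_count)
```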
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "bd435d1d",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Concept 2: Data Enrichment with LLMs"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "836d35ef",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### LLMs are adept at processing unstructured data. They can easily extract summaries, keywords, and other useful information from documents. We use function calling with Pydantic models to extract information from the unstructured data. "
|
||
]
|
||
},
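A rough sketch of the extraction step follows. To stay runnable with the standard library alone, a plain dataclass stands in for the Pydantic model, and a hard-coded JSON string stands in for the arguments an LLM would return from a function call; no model is contacted. `DocumentInsights` and `parse_function_call_arguments` are hypothetical names for this illustration. The idea it shows is that the schema defines what the LLM must return, and the response is validated against that schema before it is used.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class DocumentInsights:
    """Hypothetical target schema for the extraction."""
    summary: str
    keywords: list[str]


def parse_function_call_arguments(raw_arguments: str) -> DocumentInsights:
    """Validate the JSON arguments of a function call against the schema."""
    payload = json.loads(raw_arguments)
    expected_keys = {field.name for field in fields(DocumentInsights)}
    if set(payload) != expected_keys:
        raise ValueError(f"expected keys {sorted(expected_keys)}, got {sorted(payload)}")
    if not isinstance(payload["summary"], str):
        raise ValueError("summary must be a string")
    keywords = payload["keywords"]
    if not (isinstance(keywords, list) and all(isinstance(k, str) for k in keywords)):
        raise ValueError("keywords must be a list of strings")
    return DocumentInsights(**payload)


# Hard-coded stand-in for what an LLM might return; no model is called here.
simulated_arguments = (
    '{"summary": "Senior data scientist with NLP experience.", '
    '"keywords": ["machine learning", "python"]}'
)

insights = parse_function_call_arguments(simulated_arguments)
print(insights.keywords)
```

With Pydantic, the manual type checks would be replaced by the model's own validation; the dataclass version only makes the mechanics visible.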
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "5bc1681c",
|
||
"metadata": {},
|
||
"source": [
|
||
""
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c6f428a8",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### We decompose the loaded content into graphs, allowing us to more precisely map out the relationships between entities and concepts."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "34c2227f",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Concept 3: Graphs"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7ec176f5",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Knowledge graphs simply map out knowledge, linking specific facts and their connections. When Large Language Models (LLMs) process text, they infer these links, leading to occasional inaccuracies due to their probabilistic nature. Clearly defined relationships enhance their accuracy. This structured approach can extend beyond concepts to document layouts, pages, or other organizational schemas."
|
||
]
|
||
},
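As a small illustration of facts and their connections, the sketch below stores a knowledge graph as (subject, relation, object) triples in a plain dictionary. The triples and the helper `objects_of` are hypothetical and chosen for this example; a real deployment would use a graph store. Because the relationships are explicit, a question such as "what kinds of skills does this person have?" becomes a two-hop lookup rather than something the model has to infer.

```python
from collections import defaultdict

# Hypothetical facts, written as (subject, relation, object) triples.
triples = [
    ("emily carter", "is_a", "person"),
    ("emily carter", "has_skill", "python"),
    ("emily carter", "studied_at", "stanford university"),
    ("python", "is_a", "programming language"),
    ("stanford university", "is_a", "organization"),
]

# Adjacency list: subject -> list of (relation, object) edges.
graph = defaultdict(list)
for subject, relation, obj in triples:
    graph[subject].append((relation, obj))


def objects_of(subject: str, relation: str) -> list[str]:
    """Return every object linked to `subject` by `relation`."""
    return [obj for rel, obj in graph.get(subject, []) if rel == relation]


# Two-hop query: first find the skills, then look up what kind of thing each skill is.
skills = objects_of("emily carter", "has_skill")
skill_types = {skill: objects_of(skill, "is_a") for skill in skills}
print(skill_types)
```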
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "ff454731",
|
||
"metadata": {},
|
||
"source": [
|
||
""
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "5b3b58d3",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Concept 4: Vector and Graph Retrieval"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3555db8b",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Cognee lets you use multiple vector and graph retrieval methods to find the most relevant information."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d2d5e844",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Concept 5: Auto-Optimizing Pipelines"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6979a010",
|
||
"metadata": {},
|
||
"source": [
"#### Integrating knowledge graphs into Retrieval-Augmented Generation (RAG) pipelines has a useful side effect: the system can be evaluated in a way Machine Learning (ML) engineers are accustomed to. This involves running hundreds of synthetic questions against the RAG system, which lets the knowledge graph evolve and refine its context autonomously over time. This method paves the way for developing self-improving memory engines that can adapt to new data and user feedback."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "074f0ea8-c659-4736-be26-be4b0e5ac665",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Demo time"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0587d91d",
|
||
"metadata": {},
|
||
"source": [
"#### First, let's define some data that we will cognify and then search."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "df16431d0f48b006",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:53:59.351640Z",
|
||
"start_time": "2024-12-24T11:53:59.347420Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"job_position = \"\"\"Senior Data Scientist (Machine Learning)\n",
|
||
"\n",
|
||
"Company: TechNova Solutions\n",
|
||
"Location: San Francisco, CA\n",
|
||
"\n",
|
||
"Job Description:\n",
|
||
"\n",
|
||
"TechNova Solutions is seeking a Senior Data Scientist specializing in Machine Learning to join our dynamic analytics team. The ideal candidate will have a strong background in developing and deploying machine learning models, working with large datasets, and translating complex data into actionable insights.\n",
|
||
"\n",
|
||
"Responsibilities:\n",
|
||
"\n",
|
||
"Develop and implement advanced machine learning algorithms and models.\n",
|
||
"Analyze large, complex datasets to extract meaningful patterns and insights.\n",
|
||
"Collaborate with cross-functional teams to integrate predictive models into products.\n",
|
||
"Stay updated with the latest advancements in machine learning and data science.\n",
|
||
"Mentor junior data scientists and provide technical guidance.\n",
|
||
"Qualifications:\n",
|
||
"\n",
|
||
"Master’s or Ph.D. in Data Science, Computer Science, Statistics, or a related field.\n",
|
||
"5+ years of experience in data science and machine learning.\n",
|
||
"Proficient in Python, R, and SQL.\n",
|
||
"Experience with deep learning frameworks (e.g., TensorFlow, PyTorch).\n",
|
||
"Strong problem-solving skills and attention to detail.\n",
|
||
"Candidate CVs\n",
|
||
"\"\"\"\n"
|
||
],
|
||
"outputs": [],
|
||
"execution_count": 1
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "9086abf3af077ab4",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:53:59.365410Z",
|
||
"start_time": "2024-12-24T11:53:59.363662Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"job_1 = \"\"\"\n",
|
||
"CV 1: Relevant\n",
|
||
"Name: Dr. Emily Carter\n",
|
||
"Contact Information:\n",
|
||
"\n",
|
||
"Email: emily.carter@example.com\n",
|
||
"Phone: (555) 123-4567\n",
|
||
"Summary:\n",
|
||
"\n",
|
||
"Senior Data Scientist with over 8 years of experience in machine learning and predictive analytics. Expertise in developing advanced algorithms and deploying scalable models in production environments.\n",
|
||
"\n",
|
||
"Education:\n",
|
||
"\n",
|
||
"Ph.D. in Computer Science, Stanford University (2014)\n",
|
||
"B.S. in Mathematics, University of California, Berkeley (2010)\n",
|
||
"Experience:\n",
|
||
"\n",
|
||
"Senior Data Scientist, InnovateAI Labs (2016 – Present)\n",
|
||
"Led a team in developing machine learning models for natural language processing applications.\n",
|
||
"Implemented deep learning algorithms that improved prediction accuracy by 25%.\n",
|
||
"Collaborated with cross-functional teams to integrate models into cloud-based platforms.\n",
|
||
"Data Scientist, DataWave Analytics (2014 – 2016)\n",
|
||
"Developed predictive models for customer segmentation and churn analysis.\n",
|
||
"Analyzed large datasets using Hadoop and Spark frameworks.\n",
|
||
"Skills:\n",
|
||
"\n",
|
||
"Programming Languages: Python, R, SQL\n",
|
||
"Machine Learning: TensorFlow, Keras, Scikit-Learn\n",
|
||
"Big Data Technologies: Hadoop, Spark\n",
|
||
"Data Visualization: Tableau, Matplotlib\n",
|
||
"\"\"\""
|
||
],
|
||
"outputs": [],
|
||
"execution_count": 2
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "a9de0cc07f798b7f",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:53:59.372957Z",
|
||
"start_time": "2024-12-24T11:53:59.371152Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"job_2 = \"\"\"\n",
|
||
"CV 2: Relevant\n",
|
||
"Name: Michael Rodriguez\n",
|
||
"Contact Information:\n",
|
||
"\n",
|
||
"Email: michael.rodriguez@example.com\n",
|
||
"Phone: (555) 234-5678\n",
|
||
"Summary:\n",
|
||
"\n",
|
||
"Data Scientist with a strong background in machine learning and statistical modeling. Skilled in handling large datasets and translating data into actionable business insights.\n",
|
||
"\n",
|
||
"Education:\n",
|
||
"\n",
|
||
"M.S. in Data Science, Carnegie Mellon University (2013)\n",
|
||
"B.S. in Computer Science, University of Michigan (2011)\n",
|
||
"Experience:\n",
|
||
"\n",
|
||
"Senior Data Scientist, Alpha Analytics (2017 – Present)\n",
|
||
"Developed machine learning models to optimize marketing strategies.\n",
|
||
"Reduced customer acquisition cost by 15% through predictive modeling.\n",
|
||
"Data Scientist, TechInsights (2013 – 2017)\n",
|
||
"Analyzed user behavior data to improve product features.\n",
|
||
"Implemented A/B testing frameworks to evaluate product changes.\n",
|
||
"Skills:\n",
|
||
"\n",
|
||
"Programming Languages: Python, Java, SQL\n",
|
||
"Machine Learning: Scikit-Learn, XGBoost\n",
|
||
"Data Visualization: Seaborn, Plotly\n",
|
||
"Databases: MySQL, MongoDB\n",
|
||
"\"\"\""
|
||
],
|
||
"outputs": [],
|
||
"execution_count": 3
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "185ff1c102d06111",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:53:59.961140Z",
|
||
"start_time": "2024-12-24T11:53:59.959103Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"job_3 = \"\"\"\n",
|
||
"CV 3: Relevant\n",
|
||
"Name: Sarah Nguyen\n",
|
||
"Contact Information:\n",
|
||
"\n",
|
||
"Email: sarah.nguyen@example.com\n",
|
||
"Phone: (555) 345-6789\n",
|
||
"Summary:\n",
|
||
"\n",
|
||
"Data Scientist specializing in machine learning with 6 years of experience. Passionate about leveraging data to drive business solutions and improve product performance.\n",
|
||
"\n",
|
||
"Education:\n",
|
||
"\n",
|
||
"M.S. in Statistics, University of Washington (2014)\n",
|
||
"B.S. in Applied Mathematics, University of Texas at Austin (2012)\n",
|
||
"Experience:\n",
|
||
"\n",
|
||
"Data Scientist, QuantumTech (2016 – Present)\n",
|
||
"Designed and implemented machine learning algorithms for financial forecasting.\n",
|
||
"Improved model efficiency by 20% through algorithm optimization.\n",
|
||
"Junior Data Scientist, DataCore Solutions (2014 – 2016)\n",
|
||
"Assisted in developing predictive models for supply chain optimization.\n",
|
||
"Conducted data cleaning and preprocessing on large datasets.\n",
|
||
"Skills:\n",
|
||
"\n",
|
||
"Programming Languages: Python, R\n",
|
||
"Machine Learning Frameworks: PyTorch, Scikit-Learn\n",
|
||
"Statistical Analysis: SAS, SPSS\n",
|
||
"Cloud Platforms: AWS, Azure\n",
|
||
"\"\"\""
|
||
],
|
||
"outputs": [],
|
||
"execution_count": 4
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "d55ce4c58f8efb67",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:54:00.656495Z",
|
||
"start_time": "2024-12-24T11:54:00.654716Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"job_4 = \"\"\"\n",
|
||
"CV 4: Not Relevant\n",
|
||
"Name: David Thompson\n",
|
||
"Contact Information:\n",
|
||
"\n",
|
||
"Email: david.thompson@example.com\n",
|
||
"Phone: (555) 456-7890\n",
|
||
"Summary:\n",
|
||
"\n",
|
||
"Creative Graphic Designer with over 8 years of experience in visual design and branding. Proficient in Adobe Creative Suite and passionate about creating compelling visuals.\n",
|
||
"\n",
|
||
"Education:\n",
|
||
"\n",
|
||
"B.F.A. in Graphic Design, Rhode Island School of Design (2012)\n",
|
||
"Experience:\n",
|
||
"\n",
|
||
"Senior Graphic Designer, CreativeWorks Agency (2015 – Present)\n",
|
||
"Led design projects for clients in various industries.\n",
|
||
"Created branding materials that increased client engagement by 30%.\n",
|
||
"Graphic Designer, Visual Innovations (2012 – 2015)\n",
|
||
"Designed marketing collateral, including brochures, logos, and websites.\n",
|
||
"Collaborated with the marketing team to develop cohesive brand strategies.\n",
|
||
"Skills:\n",
|
||
"\n",
|
||
"Design Software: Adobe Photoshop, Illustrator, InDesign\n",
|
||
"Web Design: HTML, CSS\n",
|
||
"Specialties: Branding and Identity, Typography\n",
|
||
"\"\"\""
|
||
],
|
||
"outputs": [],
|
||
"execution_count": 5
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "ca4ecc32721ad332",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:54:01.184899Z",
|
||
"start_time": "2024-12-24T11:54:01.183028Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"job_5 = \"\"\"\n",
|
||
"CV 5: Not Relevant\n",
|
||
"Name: Jessica Miller\n",
|
||
"Contact Information:\n",
|
||
"\n",
|
||
"Email: jessica.miller@example.com\n",
|
||
"Phone: (555) 567-8901\n",
|
||
"Summary:\n",
|
||
"\n",
|
||
"Experienced Sales Manager with a strong track record in driving sales growth and building high-performing teams. Excellent communication and leadership skills.\n",
|
||
"\n",
|
||
"Education:\n",
|
||
"\n",
|
||
"B.A. in Business Administration, University of Southern California (2010)\n",
|
||
"Experience:\n",
|
||
"\n",
|
||
"Sales Manager, Global Enterprises (2015 – Present)\n",
|
||
"Managed a sales team of 15 members, achieving a 20% increase in annual revenue.\n",
|
||
"Developed sales strategies that expanded customer base by 25%.\n",
|
||
"Sales Representative, Market Leaders Inc. (2010 – 2015)\n",
|
||
"Consistently exceeded sales targets and received the 'Top Salesperson' award in 2013.\n",
|
||
"Skills:\n",
|
||
"\n",
|
||
"Sales Strategy and Planning\n",
|
||
"Team Leadership and Development\n",
|
||
"CRM Software: Salesforce, Zoho\n",
|
||
"Negotiation and Relationship Building\n",
|
||
"\"\"\""
|
||
],
|
||
"outputs": [],
|
||
"execution_count": 6
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "4415446a",
|
||
"metadata": {},
|
||
"source": [
"#### Please add the necessary environment information below:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "bce39dc6",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:54:04.417315Z",
|
||
"start_time": "2024-12-24T11:54:04.414132Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"import os\n",
|
||
"\n",
|
||
"# Setting environment variables\n",
"if \"GRAPHISTRY_USERNAME\" not in os.environ:\n",
"    os.environ[\"GRAPHISTRY_USERNAME\"] = \"\"\n",
"\n",
"if \"GRAPHISTRY_PASSWORD\" not in os.environ:\n",
"    os.environ[\"GRAPHISTRY_PASSWORD\"] = \"\"\n",
"\n",
"if \"LLM_API_KEY\" not in os.environ:\n",
"    os.environ[\"LLM_API_KEY\"] = \"\"\n",
|
||
"\n",
|
||
"# \"neo4j\" or \"networkx\"\n",
|
||
"os.environ[\"GRAPH_DATABASE_PROVIDER\"]=\"networkx\" \n",
|
||
"# Not needed if using networkx\n",
|
||
"#os.environ[\"GRAPH_DATABASE_URL\"]=\"\"\n",
|
||
"#os.environ[\"GRAPH_DATABASE_USERNAME\"]=\"\"\n",
|
||
"#os.environ[\"GRAPH_DATABASE_PASSWORD\"]=\"\"\n",
|
||
"\n",
|
||
"# \"pgvector\", \"qdrant\", \"weaviate\" or \"lancedb\"\n",
|
||
"os.environ[\"VECTOR_DB_PROVIDER\"]=\"lancedb\" \n",
|
||
"# Not needed if using \"lancedb\" or \"pgvector\"\n",
|
||
"# os.environ[\"VECTOR_DB_URL\"]=\"\"\n",
|
||
"# os.environ[\"VECTOR_DB_KEY\"]=\"\"\n",
|
||
"\n",
|
||
"# Relational Database provider \"sqlite\" or \"postgres\"\n",
|
||
"os.environ[\"DB_PROVIDER\"]=\"sqlite\"\n",
|
||
"\n",
|
||
"# Database name\n",
|
||
"os.environ[\"DB_NAME\"]=\"cognee_db\"\n",
|
||
"\n",
|
||
"# Postgres specific parameters (Only if Postgres or PGVector is used)\n",
|
||
"# os.environ[\"DB_HOST\"]=\"127.0.0.1\"\n",
|
||
"# os.environ[\"DB_PORT\"]=\"5432\"\n",
|
||
"# os.environ[\"DB_USERNAME\"]=\"cognee\"\n",
|
||
"# os.environ[\"DB_PASSWORD\"]=\"cognee\""
|
||
],
|
||
"outputs": [],
|
||
"execution_count": 7
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "9f1a1dbd",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:54:16.672999Z",
|
||
"start_time": "2024-12-24T11:54:07.425202Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"# Reset the cognee system with the following command:\n",
|
||
"\n",
|
||
"import cognee\n",
|
||
"\n",
|
||
"await cognee.prune.prune_data()\n",
|
||
"await cognee.prune.prune_system(metadata=True)"
|
||
],
|
||
"outputs": [
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"File /Users/vasilije/cognee/cognee/.cognee_system/databases/cognee_graph.pkl not found. Initializing an empty graph."
|
||
]
|
||
},
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Database deleted successfully.\n"
|
||
]
|
||
},
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"/Users/vasilije/cognee/.venv/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
|
||
" from .autonotebook import tqdm as notebook_tqdm\n"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 8
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "383d6971",
|
||
"metadata": {},
|
||
"source": [
"#### Now that we have defined and gathered our data, let's add it to cognee."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "904df61ba484a8e5",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:54:28.313862Z",
|
||
"start_time": "2024-12-24T11:54:23.756587Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"import cognee\n",
|
||
"\n",
|
||
"await cognee.add([job_1, job_2, job_3, job_4, job_5, job_position], \"example\")"
|
||
],
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"User df77f15b-a077-4c86-a3e4-c059bf4cacb9 has registered.\n"
|
||
]
|
||
},
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"/Users/vasilije/cognee/.venv/lib/python3.11/site-packages/dlt/destinations/impl/sqlalchemy/merge_job.py:194: SAWarning: Table 'file_metadata' already exists within the given MetaData - not copying.\n",
|
||
" staging_table_obj = table_obj.to_metadata(\n",
|
||
"/Users/vasilije/cognee/.venv/lib/python3.11/site-packages/dlt/destinations/impl/sqlalchemy/merge_job.py:229: SAWarning: implicitly coercing SELECT object to scalar subquery; please use the .scalar_subquery() method to produce a scalar subquery.\n",
|
||
" order_by=order_dir_func(order_by_col),\n"
|
||
]
|
||
},
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Pipeline file_load_from_filesystem load step completed in 0.03 seconds\n",
|
||
"1 load package(s) were loaded to destination sqlalchemy and into dataset main\n",
|
||
"The sqlalchemy destination used sqlite:////Users/vasilije/cognee/cognee/.cognee_system/databases/cognee_db location to store data\n",
|
||
"Load package 1735041267.4777632 is LOADED and contains no failed jobs\n"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 9
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0f15c5b1",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### All good, let's cognify it."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "7c431fdef4921ae0",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:54:44.728010Z",
|
||
"start_time": "2024-12-24T11:54:44.723877Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"from cognee.shared.data_models import KnowledgeGraph\n",
|
||
"from cognee.modules.data.models import Dataset, Data\n",
|
||
"from cognee.modules.data.methods.get_dataset_data import get_dataset_data\n",
|
||
"from cognee.modules.cognify.config import get_cognify_config\n",
|
||
"from cognee.modules.pipelines.tasks.Task import Task\n",
|
||
"from cognee.modules.pipelines import run_tasks\n",
|
||
"from cognee.modules.users.models import User\n",
|
||
"from cognee.tasks.documents import check_permissions_on_documents, classify_documents, extract_chunks_from_documents\n",
|
||
"from cognee.tasks.graph import extract_graph_from_data\n",
|
||
"from cognee.tasks.storage import add_data_points\n",
|
||
"from cognee.tasks.summarization import summarize_text\n",
|
||
"\n",
"async def run_cognify_pipeline(dataset: Dataset, user: User = None):\n",
"    data_documents: list[Data] = await get_dataset_data(dataset_id = dataset.id)\n",
"\n",
"    try:\n",
"        cognee_config = get_cognify_config()\n",
"\n",
"        tasks = [\n",
"            Task(classify_documents),\n",
"            Task(check_permissions_on_documents, user = user, permissions = [\"write\"]),\n",
"            Task(extract_chunks_from_documents), # Extract text chunks based on the document type.\n",
"            Task(extract_graph_from_data, graph_model = KnowledgeGraph, task_config = { \"batch_size\": 10 }), # Generate knowledge graphs from the document chunks.\n",
"            Task(\n",
"                summarize_text,\n",
"                summarization_model = cognee_config.summarization_model,\n",
"                task_config = { \"batch_size\": 10 }\n",
"            ),\n",
"            Task(add_data_points, task_config = { \"batch_size\": 10 }),\n",
"        ]\n",
"\n",
"        pipeline = run_tasks(tasks, data_documents)\n",
"\n",
"        async for result in pipeline:\n",
"            print(result)\n",
"    except Exception as error:\n",
"        raise error\n"
|
||
],
|
||
"outputs": [],
|
||
"execution_count": 10
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "f0a91b99c6215e09",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T11:55:26.027386Z",
|
||
"start_time": "2024-12-24T11:54:47.384342Z"
|
||
}
|
||
},
|
||
"source": [
|
||
"from cognee.modules.users.methods import get_default_user\n",
|
||
"from cognee.modules.data.methods import get_datasets_by_name\n",
|
||
"\n",
|
||
"user = await get_default_user()\n",
|
||
"\n",
|
||
"datasets = await get_datasets_by_name([\"example\"], user.id)\n",
|
||
"\n",
|
||
"await run_cognify_pipeline(datasets[0], user)"
|
||
],
|
||
"outputs": [
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"File /Users/vasilije/cognee/cognee/.cognee_system/databases/cognee_graph.pkl not found. Initializing an empty graph./Users/vasilije/cognee/.venv/lib/python3.11/site-packages/pydantic/main.py:1522: RuntimeWarning: fields may not start with an underscore, ignoring \"_metadata\"\n",
|
||
" warnings.warn(f'fields may not start with an underscore, ignoring \"{f_name}\"', RuntimeWarning)\n",
|
||
"/Users/vasilije/cognee/.venv/lib/python3.11/site-packages/pydantic/main.py:1522: RuntimeWarning: fields may not start with an underscore, ignoring \"__tablename__\"\n",
|
||
" warnings.warn(f'fields may not start with an underscore, ignoring \"{f_name}\"', RuntimeWarning)\n"
|
||
]
|
||
},
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"[TextSummary(id=UUID('92b5d0a7-f980-529d-bb5b-48e72825a01a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='Experienced Senior Data Scientist with expertise in machine learning and predictive modeling, demonstrating over 8 years in the field.', made_from=DocumentChunk(id=UUID('70b823e2-5b12-57b5-ad8d-798e1d721f8e'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='\\nCV 1: Relevant\\nName: Dr. Emily Carter\\nContact Information:\\n\\nEmail: emily.carter@example.com\\nPhone: (555) 123-4567\\nSummary:\\n\\nSenior Data Scientist with over 8 years of experience in machine learning and predictive analytics. Expertise in developing advanced algorithms and deploying scalable models in production environments.\\n\\nEducation:\\n\\nPh.D. in Computer Science, Stanford University (2014)\\nB.S. in Mathematics, University of California, Berkeley (2010)\\nExperience:\\n\\nSenior Data Scientist, InnovateAI Labs (2016 – Present)\\nLed a team in developing machine learning models for natural language processing applications.\\nImplemented deep learning algorithms that improved prediction accuracy by 25%.\\nCollaborated with cross-functional teams to integrate models into cloud-based platforms.\\nData Scientist, DataWave Analytics (2014 – 2016)\\nDeveloped predictive models for customer segmentation and churn analysis.\\nAnalyzed large datasets using Hadoop and Spark frameworks.\\nSkills:\\n\\nProgramming Languages: Python, R, SQL\\nMachin e Learning: TensorFlow, Keras, Scikit-Learn\\nBig Data Technologies: Hadoop, Spark\\nData Visualization: Tableau, Matplotlib\\n', word_count=133, chunk_index=0, cut_type='sentence_cut', is_part_of=TextDocument(id=UUID('11a2b08a-c160-5961-80b7-b3498eafa973'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, 
name='text_85410f4ad1197f5974aef9aed6f103c8', raw_data_location='/Users/vasilije/cognee/cognee/.data_storage/data/text_85410f4ad1197f5974aef9aed6f103c8.txt', metadata_id=UUID('11a2b08a-c160-5961-80b7-b3498eafa973'), mime_type='text/plain', type='text'), contains=[Entity(id=UUID('29e771c8-4c3f-52de-9511-6b705878e130'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='dr. emily carter', is_a=EntityType(id=UUID('d072ba0f-e1a9-58bf-9974-e1802adc8134'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='person', description='person'), description='Senior Data Scientist with over 8 years of experience in machine learning and predictive analytics.'), Entity(id=UUID('ce8b394a-b30e-52fc-b80a-6352edc60e5b'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='stanford university', is_a=EntityType(id=UUID('d3d7b6b4-9b0d-52e8-9e09-a9e9cf4b5a4d'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='organization', description='organization'), description='Prestigious university located in Stanford, California.'), Entity(id=UUID('2c02c93c-9cd1-56b8-9cc0-55ff0b290e57'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='university of california, berkeley', is_a=EntityType(id=UUID('d3d7b6b4-9b0d-52e8-9e09-a9e9cf4b5a4d'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='organization', description='organization'), description='Public research university located in Berkeley, California.'), Entity(id=UUID('9780afb1-dccc-53eb-9a30-c0d4ce033711'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='innovateai 
labs', is_a=EntityType(id=UUID('d3d7b6b4-9b0d-52e8-9e09-a9e9cf4b5a4d'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='organization', description='organization'), description='A lab focused on artificial intelligence projects.'), Entity(id=UUID('50d0a685-5300-544f-b081-edca4b625886'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='datawave analytics', is_a=EntityType(id=UUID('d3d7b6b4-9b0d-52e8-9e09-a9e9cf4b5a4d'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='organization', description='organization'), description='Analytics firm specialized in data-driven insights.'), Entity(id=UUID('c95db510-e2ee-5a00-bded-20bbcb50c492'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='python', is_a=EntityType(id=UUID('80d409bb-e431-5939-a1ad-3acd96267128'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='programming language', description='programming language'), description='A high-level programming language used for general-purpose programming.'), Entity(id=UUID('39bd9707-8098-52ed-9cbf-bbdd26b963fb'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='r', is_a=EntityType(id=UUID('80d409bb-e431-5939-a1ad-3acd96267128'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='programming language', description='programming language'), description='A programming language and environment for statistical computing and graphics.'), Entity(id=UUID('1ff6821a-b207-5050-83e9-37ff67a27d03'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), 
topological_rank=0, name='sql', is_a=EntityType(id=UUID('80d409bb-e431-5939-a1ad-3acd96267128'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='programming language', description='programming language'), description='A domain-specific language used in programming and managing relational databases.'), Entity(id=UUID('6e72f6f5-0452-5d42-a4e8-4aba6a614cb1'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='tensorflow', is_a=EntityType(id=UUID('9ffe9ce7-8938-5a5c-8d03-5f1a4c5210a1'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='machine learning framework', description='machine learning framework'), description='An open-source software library for dataflow and differentiable programming across a range of tasks.'), Entity(id=UUID('ab85cdff-2a98-5c6d-99a3-df1f40f4ec16'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='keras', is_a=EntityType(id=UUID('9ffe9ce7-8938-5a5c-8d03-5f1a4c5210a1'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='machine learning framework', description='machine learning framework'), description='An open-source neural network library written in Python that runs on top of TensorFlow.'), Entity(id=UUID('37eecdcc-fb56-519c-bc18-d0d3afea0c0d'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='scikit-learn', is_a=EntityType(id=UUID('9ffe9ce7-8938-5a5c-8d03-5f1a4c5210a1'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='machine learning framework', description='machine learning framework'), description='A free software machine learning library for the Python 
programming language.'), Entity(id=UUID('f9a0eeca-c9ff-53b3-90eb-347254d7d7eb'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='hadoop', is_a=EntityType(id=UUID('7c2287d0-16fc-53dc-86ce-8d8e61c8642c'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='big data technology', description='big data technology'), description='An open-source framework for storing and processing large datasets in a distributed computing environment.'), Entity(id=UUID('46a235af-5ed5-5023-a4ec-c253e3f93031'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='spark', is_a=EntityType(id=UUID('7c2287d0-16fc-53dc-86ce-8d8e61c8642c'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='big data technology', description='big data technology'), description='An open-source unified analytics engine for large-scale data processing.'), Entity(id=UUID('c55004f3-8a6d-5130-b8bd-ed8278daa9a4'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='tableau', is_a=EntityType(id=UUID('674cc5fa-7849-575a-917f-90b7b77f52b3'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='data visualization software', description='data visualization software'), description='A visual analytics platform transforming the way we use data to solve problems.'), Entity(id=UUID('3c7adf8f-ef23-5330-a3fe-6a0b791cee2b'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='matplotlib', is_a=EntityType(id=UUID('3f3619fc-ebd1-50ed-adde-cf94e8bb3c1b'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), 
topological_rank=0, name='data visualization library', description='data visualization library'), description='A plotting library for the Python programming language and its numerical mathematics extension NumPy.')])), TextSummary(id=UUID('2f680bef-2edd-566e-b98c-78d549799e77'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='Senior Data Scientist specializing in Machine Learning at TechNova Solutions', made_from=DocumentChunk(id=UUID('eb6617b8-c78c-519b-b765-1eefc2e3a0d7'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='Senior Data Scientist (Machine Learning)\\n\\nCompany: TechNova Solutions\\nLocation: San Francisco, CA\\n\\nJob Description:\\n\\nTechNova Solutions is seeking a Senior Data Scientist specializing in Machine Learning to join our dynamic analytics team. The ideal candidate will have a strong background in developing and deploying machine learning models, working with large datasets, and translating complex data into actionable insights.\\n\\nResponsibilities:\\n\\nDevelop and implement advanced machine learning algorithms and models.\\nAnalyze large, complex datasets to extract meaningful patterns and insights.\\nCollaborate with cross-functional teams to integrate predictive models into products.\\nStay updated with the latest advancements in machine learning and data science.\\nMentor junior data scientists and provide technical guidance.\\nQualifications:\\n\\nMaster’s or Ph.D. 
in Data Science, Computer Science, Statistics, or a related field.\\n5+ years of experience in data science and machine learning.\\nProficient in Python, R, and SQL.\\nExpe rience with deep learning frameworks (e.g., TensorFlow, PyTorch).\\nStrong problem-solving skills and attention to detail.\\nCandidate CVs\\n', word_count=153, chunk_index=0, cut_type='sentence_cut', is_part_of=TextDocument(id=UUID('171f3035-4c37-5f7b-97c8-6b222404cc9a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='text_81a5a96a9a7325d40521ea453778ebe0', raw_data_location='/Users/vasilije/cognee/cognee/.data_storage/data/text_81a5a96a9a7325d40521ea453778ebe0.txt', metadata_id=UUID('171f3035-4c37-5f7b-97c8-6b222404cc9a'), mime_type='text/plain', type='text'), contains=[Entity(id=UUID('453a45c9-14e7-5b73-adb8-55991096fef0'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='technova solutions', is_a=EntityType(id=UUID('a6ed6bf1-fe31-5dfe-8ab4-484691fdf219'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='company', description='company'), description='A technology company specializing in data analytics and machine learning.'), Entity(id=UUID('435dbd37-ab20-503c-9e99-ab8b8a3484e5'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='senior data scientist', is_a=EntityType(id=UUID('524c6bbb-1534-5a51-8068-18dd4ae171eb'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='profession', description='profession'), description='A role focused on advanced data analysis and machine learning.'), Entity(id=UUID('198e2ab8-75e9-5931-97ab-da9a5a8e188c'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), 
topological_rank=0, name='san francisco, ca', is_a=EntityType(id=UUID('19dd7d4d-a966-5ed5-82a0-6ae377761a29'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='location', description='location'), description='A city in California, USA.'), Entity(id=UUID('5187986a-7305-5a63-b057-8f2c097419eb'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='machine learning', is_a=EntityType(id=UUID('0198571b-3e94-50ea-8b9f-19e3a31080c0'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='field', description='field'), description='A subset of artificial intelligence focused on the development of algorithms that enable computers to learn from data.'), Entity(id=UUID('d6545b21-153c-58ba-be47-46e5216521a3'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='data science', is_a=EntityType(id=UUID('0198571b-3e94-50ea-8b9f-19e3a31080c0'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='field', description='field'), description='A multidisciplinary field that uses scientific methods to extract knowledge and insights from data.'), Entity(id=UUID('c0d95499-de6b-5fcf-b0f5-9cbf427ad5c6'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='pytorch', is_a=EntityType(id=UUID('36a32bd3-8880-547a-949b-8447477d1ef5'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='framework', description='framework'), description='An open-source machine learning framework for deep learning.'), Entity(id=UUID('62b4dda1-de4a-5098-a56e-d3fe81f84dbc'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, 
tzinfo=datetime.timezone.utc), topological_rank=0, name='masters or ph.d. in data science', is_a=EntityType(id=UUID('a49b283a-ce92-50e0-b7fa-ca7c628eb01a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='degree', description='degree'), description='Advanced academic degree in data science or related fields.')])), TextSummary(id=UUID('5c988618-db52-5979-9cf8-db80c0098285'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='Data Scientist with expertise in machine learning and statistical analysis, adept at managing extensive datasets and converting data into practical business solutions.', made_from=DocumentChunk(id=UUID('a6e82ac7-e791-5d6b-b4a9-f5e41cbe95bf'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='\\nCV 2: Relevant\\nName: Michael Rodriguez\\nContact Information:\\n\\nEmail: michael.rodriguez@example.com\\nPhone: (555) 234-5678\\nSummary:\\n\\nData Scientist with a strong background in machine learning and statistical modeling. Skilled in handling large datasets and translating data into actionable business insights.\\n\\nEducation:\\n\\nM.S. in Data Science, Carnegie Mellon University (2013)\\nB.S. 
in Computer Science, University of Michigan (2011)\\nExperience:\\n\\nSenior Data Scientist, Alpha Analytics (2017 – Present)\\nDeveloped machine learning models to optimize marketing strategies.\\nReduced customer acquisition cost by 15% through predictive modeling.\\nData Scientist, TechInsights (2013 – 2017)\\nAnalyzed user behavior data to improve product features.\\nImplemented A/B testing frameworks to evaluate product changes.\\nSkills:\\n\\nProgramming Languages: Python, Java, SQL\\nMachine Learning: Scikit-Learn, XGBoost\\nData Visualization: Seaborn, Plotly\\nDatabases: MySQL, MongoDB\\n', word_count=108, chunk_index=0, cut_type='sentence_cut', is_part_of=TextDocument(id=UUID('1f078b0a-3cc1-57a9-9802-f78565d49f29'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='text_f0f63a5c88dbbeef1eca23d32848220c', raw_data_location='/Users/vasilije/cognee/cognee/.data_storage/data/text_f0f63a5c88dbbeef1eca23d32848220c.txt', metadata_id=UUID('1f078b0a-3cc1-57a9-9802-f78565d49f29'), mime_type='text/plain', type='text'), contains=[Entity(id=UUID('73ae630f-7b09-5dce-8c18-45d0a57b30f9'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='michael rodriguez', is_a=EntityType(id=UUID('d072ba0f-e1a9-58bf-9974-e1802adc8134'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='person', description='person'), description='Data Scientist with a strong background in machine learning and statistical modeling.'), Entity(id=UUID('5534e0b0-d0c4-5ab9-82e9-91bed36f70bd'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='carnegie mellon university', is_a=EntityType(id=UUID('912b273c-683d-53ea-8ffe-aadef0b84237'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), 
topological_rank=0, name='educational institution', description='educational institution'), description='University known for its data science program.'), Entity(id=UUID('0af613e0-c11b-550d-ada2-2c2aa6550884'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='university of michigan', is_a=EntityType(id=UUID('912b273c-683d-53ea-8ffe-aadef0b84237'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='educational institution', description='educational institution'), description='University known for its computer science program.'), Entity(id=UUID('04a91fef-8a07-5d50-8f1b-46f3afeec497'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='alpha analytics', is_a=EntityType(id=UUID('a6ed6bf1-fe31-5dfe-8ab4-484691fdf219'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='company', description='company'), description='Company where Michael Rodriguez works as a Senior Data Scientist.'), Entity(id=UUID('3f848ed6-902f-5a8e-9577-cb67f8c17acd'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='techinsights', is_a=EntityType(id=UUID('a6ed6bf1-fe31-5dfe-8ab4-484691fdf219'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='company', description='company'), description='Company where Michael Rodriguez worked as a Data Scientist.')])), TextSummary(id=UUID('ee6cb607-27eb-5b87-bf2a-305721534263'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='Sales Manager with proven ability to enhance revenue and cultivate effective teams. 
Strong communicator and leader.', made_from=DocumentChunk(id=UUID('7e35407f-7c59-5429-8824-23f1d17118c0'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text=\"\\nCV 5: Not Relevant\\nName: Jessica Miller\\nContact Information:\\n\\nEmail: jessica.miller@example.com\\nPhone: (555) 567-8901\\nSummary:\\n\\nExperienced Sales Manager with a strong track record in driving sales growth and building high-performing teams. Excellent communication and leadership skills.\\n\\nEducation:\\n\\nB.A. in Business Administration, University of Southern California (2010)\\nExperience:\\n\\nSales Manager, Global Enterprises (2015 – Present)\\nManaged a sales team of 15 members, achieving a 20% increase in annual revenue.\\nDeveloped sales strategies that expanded customer base by 25%.\\nSales Representative, Market Leaders Inc. (2010 – 2015)\\nConsistently exceeded sales targets and received the 'Top Salesperson' award in 2013.\\nSkills:\\n\\nSales Strategy and Planning\\nTeam Leadership and Development\\nCRM Software: Salesforce, Zoho\\nNegotiation and Relationship Building\\n\", word_count=102, chunk_index=0, cut_type='sentence_cut', is_part_of=TextDocument(id=UUID('3c323fc9-9165-52da-a079-2627a9556b08'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='text_9b35c7df1f5d4dc84e78270c0bf9cac6', raw_data_location='/Users/vasilije/cognee/cognee/.data_storage/data/text_9b35c7df1f5d4dc84e78270c0bf9cac6.txt', metadata_id=UUID('3c323fc9-9165-52da-a079-2627a9556b08'), mime_type='text/plain', type='text'), contains=[Entity(id=UUID('36a5e3c8-c5f5-5ab5-8d59-ea69d8b36932'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='jessica miller', is_a=EntityType(id=UUID('d072ba0f-e1a9-58bf-9974-e1802adc8134'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, 
tzinfo=datetime.timezone.utc), topological_rank=0, name='person', description='person'), description='An experienced sales manager with a strong track record in driving sales growth and building high-performing teams.'), Entity(id=UUID('5c32691d-c0e4-5378-9aab-dda8b0fa3931'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='global enterprises', is_a=EntityType(id=UUID('a6ed6bf1-fe31-5dfe-8ab4-484691fdf219'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='company', description='company'), description='A company where Jessica Miller worked as a Sales Manager.'), Entity(id=UUID('67544857-983a-5152-801d-4fc9d35d14e4'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='market leaders inc.', is_a=EntityType(id=UUID('a6ed6bf1-fe31-5dfe-8ab4-484691fdf219'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='company', description='company'), description='A company where Jessica Miller worked as a Sales Representative.'), Entity(id=UUID('f39d6c00-689b-5fd2-9021-893b28ac6ff2'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='university of southern california', is_a=EntityType(id=UUID('912b273c-683d-53ea-8ffe-aadef0b84237'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='educational institution', description='educational institution'), description='University where Jessica Miller obtained her degree in Business Administration.'), Entity(id=UUID('0abc801d-38ca-5003-b974-b60f1956c94a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='2010', 
is_a=EntityType(id=UUID('d61d99ac-b291-5666-9748-3e80e1c8b56a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='date', description='date'), description='Year Jessica Miller graduated from University of Southern California.'), Entity(id=UUID('7c8b43c1-e133-52e6-99aa-239534f1ed45'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='2015', is_a=EntityType(id=UUID('d61d99ac-b291-5666-9748-3e80e1c8b56a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='date', description='date'), description='Year Jessica Miller started working as Sales Manager at Global Enterprises.'), Entity(id=UUID('2f4749e9-e1e4-5af0-be80-2a10d07557ff'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='present', is_a=EntityType(id=UUID('d61d99ac-b291-5666-9748-3e80e1c8b56a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='date', description='date'), description=\"Current time indicative of Jessica Miller's ongoing role.\")])), TextSummary(id=UUID('d8a8668e-b122-5713-b289-932407bb294e'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='Creative Graphic Designer with 8+ years of expertise in visual design and branding, skilled in Adobe Creative Suite, dedicated to crafting engaging visuals.', made_from=DocumentChunk(id=UUID('c401b5b1-21d8-5830-8c7b-48e7d94c5b95'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='\\nCV 4: Not Relevant\\nName: David Thompson\\nContact Information:\\n\\nEmail: david.thompson@example.com\\nPhone: (555) 456-7890\\nSummary:\\n\\nCreative Graphic Designer with over 8 years of 
experience in visual design and branding. Proficient in Adobe Creative Suite and passionate about creating compelling visuals.\\n\\nEducation:\\n\\nB.F.A. in Graphic Design, Rhode Island School of Design (2012)\\nExperience:\\n\\nSenior Graphic Designer, CreativeWorks Agency (2015 – Present)\\nLed design projects for clients in various industries.\\nCreated branding materials that increased client engagement by 30%.\\nGraphic Designer, Visual Innovations (2012 – 2015)\\nDesigned marketing collateral, including brochures, logos, and websites.\\nCollaborated with the marketing team to develop cohesive brand strategies.\\nSkills:\\n\\nDesign Software: Adobe Photoshop, Illustrator, InDesign\\nWeb Design: HTML, CSS\\nSpecialties: Branding and Identity, Typography\\n', word_count=108, chunk_index=0, cut_type='sentence_cut', is_part_of=TextDocument(id=UUID('e71daf63-15a0-50fe-a909-766bc8fd311b'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='text_9abf20fa7defd7e49296c51b4e38edf2', raw_data_location='/Users/vasilije/cognee/cognee/.data_storage/data/text_9abf20fa7defd7e49296c51b4e38edf2.txt', metadata_id=UUID('e71daf63-15a0-50fe-a909-766bc8fd311b'), mime_type='text/plain', type='text'), contains=[Entity(id=UUID('a4777597-06c7-562c-bc44-56f74571a01a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='david thompson', is_a=EntityType(id=UUID('d072ba0f-e1a9-58bf-9974-e1802adc8134'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='person', description='person'), description='Creative Graphic Designer with over 8 years of experience in visual design and branding.'), Entity(id=UUID('ca20272a-3e88-552f-92fe-491e23f117f8'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='creativeworks agency', 
is_a=EntityType(id=UUID('d3d7b6b4-9b0d-52e8-9e09-a9e9cf4b5a4d'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='organization', description='organization'), description='An agency where David Thompson is a Senior Graphic Designer.'), Entity(id=UUID('1e97bb97-4d29-5fb8-863a-15ab51f1dd46'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='visual innovations', is_a=EntityType(id=UUID('d3d7b6b4-9b0d-52e8-9e09-a9e9cf4b5a4d'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='organization', description='organization'), description='An organization where David Thompson worked as a Graphic Designer.'), Entity(id=UUID('60b027fe-7bb4-535d-8a47-19f1a491591b'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='rhode island school of design', is_a=EntityType(id=UUID('b5866225-05ad-5cfc-908e-c22916c6a1c6'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='institution', description='institution'), description='Educational institution where David Thompson earned his B.F.A. 
in Graphic Design.'), Entity(id=UUID('7e3df89c-2691-580b-84dc-378cb1df3db6'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='adobe creative suite', is_a=EntityType(id=UUID('2d66edc2-1e14-55ab-8304-680b514a597a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='software', description='software'), description='A suite of graphic design, video editing, and web development applications.'), Entity(id=UUID('2a0f9b58-c695-5bad-baa2-fd2da02da013'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='html', is_a=EntityType(id=UUID('c90c7d6b-3532-5dcf-91e1-4a0e1f179794'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='language', description='language'), description='A markup language for creating web pages and applications.'), Entity(id=UUID('9b062f3c-fe02-5427-9b44-b287a1cac367'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='css', is_a=EntityType(id=UUID('c90c7d6b-3532-5dcf-91e1-4a0e1f179794'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='language', description='language'), description='A stylesheet language used for describing the presentation of a document written in HTML.')])), TextSummary(id=UUID('8aedca6b-fa78-5987-a79b-3b0bebff8eb1'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, text='Data Scientist with 6 years of expertise in machine learning, focused on utilizing data for business improvement and product enhancement.', made_from=DocumentChunk(id=UUID('00692e43-9f02-54d0-b695-44bf47342d36'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, 
tzinfo=datetime.timezone.utc), topological_rank=0, text='\\nCV 3: Relevant\\nName: Sarah Nguyen\\nContact Information:\\n\\nEmail: sarah.nguyen@example.com\\nPhone: (555) 345-6789\\nSummary:\\n\\nData Scientist specializing in machine learning with 6 years of experience. Passionate about leveraging data to drive business solutions and improve product performance.\\n\\nEducation:\\n\\nM.S. in Statistics, University of Washington (2014)\\nB.S. in Applied Mathematics, University of Texas at Austin (2012)\\nExperience:\\n\\nData Scientist, QuantumTech (2016 – Present)\\nDesigned and implemented machine learning algorithms for financial forecasting.\\nImproved model efficiency by 20% through algorithm optimization.\\nJunior Data Scientist, DataCore Solutions (2014 – 2016)\\nAssisted in developing predictive models for supply chain optimization.\\nConducted data cleaning and preprocessing on large datasets.\\nSkills:\\n\\nProgramming Languages: Python, R\\nMachine Learning Frameworks: PyTorch, Scikit-Learn\\nStatistical Analysis: SAS, SPSS\\nCloud Platforms: AWS, Azure\\n', word_count=110, chunk_index=0, cut_type='sentence_cut', is_part_of=TextDocument(id=UUID('e7d6246b-e414-5b9d-8daa-6d4434b7fa47'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='text_f8c7482e727f228001b0046ed68d656f', raw_data_location='/Users/vasilije/cognee/cognee/.data_storage/data/text_f8c7482e727f228001b0046ed68d656f.txt', metadata_id=UUID('e7d6246b-e414-5b9d-8daa-6d4434b7fa47'), mime_type='text/plain', type='text'), contains=[Entity(id=UUID('4d8dda57-2681-5264-a2bd-e2ddfe66a785'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='sarah nguyen', is_a=EntityType(id=UUID('d072ba0f-e1a9-58bf-9974-e1802adc8134'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='person', description='person'), 
description='Data Scientist specializing in machine learning with 6 years of experience.'), Entity(id=UUID('ae74a35b-d5f1-5622-ade1-6703d5e069fb'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='university of washington', is_a=EntityType(id=UUID('912b273c-683d-53ea-8ffe-aadef0b84237'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='educational institution', description='educational institution'), description='University located in Seattle, Washington.'), Entity(id=UUID('301b3cf8-5a5c-585e-80bd-f79901e4368c'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='university of texas at austin', is_a=EntityType(id=UUID('912b273c-683d-53ea-8ffe-aadef0b84237'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='educational institution', description='educational institution'), description='Public research university located in Austin, Texas.'), Entity(id=UUID('0d980f2a-09dd-581e-acc3-cc2d87c1bab4'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='quantumtech', is_a=EntityType(id=UUID('a6ed6bf1-fe31-5dfe-8ab4-484691fdf219'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='company', description='company'), description='Company where Sarah Nguyen works as a Data Scientist from 2016 to present.'), Entity(id=UUID('95ac0551-38fc-5187-a422-533aeb7e8db0'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='datacore solutions', is_a=EntityType(id=UUID('a6ed6bf1-fe31-5dfe-8ab4-484691fdf219'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), 
topological_rank=0, name='company', description='company'), description='Company where Sarah Nguyen worked as a Junior Data Scientist from 2014 to 2016.'), Entity(id=UUID('3edcdf3f-25af-57a3-8878-8008bd7ea05a'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='aws', is_a=EntityType(id=UUID('d84d991a-dab3-5c36-8806-df076ccb731b'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='cloud platform', description='cloud platform'), description='Amazon Web Services, a cloud computing platform.'), Entity(id=UUID('8b431923-4aa2-5886-a661-b8de0f888a9b'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='azure', is_a=EntityType(id=UUID('d84d991a-dab3-5c36-8806-df076ccb731b'), updated_at=datetime.datetime(2024, 12, 24, 11, 54, 13, 481297, tzinfo=datetime.timezone.utc), topological_rank=0, name='cloud platform', description='cloud platform'), description='Microsoft Azure, a cloud computing service created by Microsoft.')]))]\n"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 11
|
||
},
|
||
{
"cell_type": "markdown",
"id": "219a6d41",
"metadata": {},
"source": [
"#### The notebook cell below prints the URL of the graph on Graphistry, showing the nodes and connections created by the cognify process."
]
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "080389e5",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-29T16:55:51.378129Z",
|
||
"start_time": "2024-12-29T16:55:46.922951Z"
|
||
}
|
||
},
"source": [
"import os\n",
"from cognee.shared.utils import render_graph\n",
"from cognee.infrastructure.databases.graph import get_graph_engine\n",
"import graphistry\n",
"from dotenv import load_dotenv\n",
"\n",
"# GRAPHISTRY_USERNAME and GRAPHISTRY_PASSWORD are read from the environment or a .env file.\n",
"# Do not hardcode credentials in the notebook.\n",
"load_dotenv()\n",
"\n",
"graphistry.login(username=os.getenv(\"GRAPHISTRY_USERNAME\"), password=os.getenv(\"GRAPHISTRY_PASSWORD\"))\n",
"\n",
"graph_engine = await get_graph_engine()\n",
"\n",
"graph_url = await render_graph(graph_engine.graph)\n",
"print(graph_url)"
],
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Graph is visualized at: https://hub.graphistry.com/graph/graph.html?dataset=cc21b1d2d6074323aa37af53e693b1a4&type=arrow&viztoken=db05565e-79e9-4fe3-99b2-b7a2e6d48eff&usertag=5f822e63-pygraphistry-0.33.9&splashAfter=1735491366&info=true\n",
|
||
"https://hub.graphistry.com/graph/graph.html?dataset=cc21b1d2d6074323aa37af53e693b1a4&type=arrow&viztoken=db05565e-79e9-4fe3-99b2-b7a2e6d48eff&usertag=5f822e63-pygraphistry-0.33.9&splashAfter=1735491366&info=true\n"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 12
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-29T17:27:20.635684Z",
|
||
"start_time": "2024-12-29T17:27:18.253572Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": "!pip install cairosvg",
|
||
"id": "1995cab3f32a393b",
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Collecting cairosvg\r\n",
|
||
" Obtaining dependency information for cairosvg from https://files.pythonhosted.org/packages/01/a5/1866b42151f50453f1a0d28fc4c39f5be5f412a2e914f33449c42daafdf1/CairoSVG-2.7.1-py3-none-any.whl.metadata\r\n",
|
||
" Downloading CairoSVG-2.7.1-py3-none-any.whl.metadata (2.7 kB)\r\n",
|
||
"Collecting cairocffi (from cairosvg)\r\n",
|
||
" Obtaining dependency information for cairocffi from https://files.pythonhosted.org/packages/93/d8/ba13451aa6b745c49536e87b6bf8f629b950e84bd0e8308f7dc6883b67e2/cairocffi-1.7.1-py3-none-any.whl.metadata\r\n",
|
||
" Downloading cairocffi-1.7.1-py3-none-any.whl.metadata (3.3 kB)\r\n",
|
||
"Collecting cssselect2 (from cairosvg)\r\n",
|
||
" Obtaining dependency information for cssselect2 from https://files.pythonhosted.org/packages/9d/3a/e39436efe51894243ff145a37c4f9a030839b97779ebcc4f13b3ba21c54e/cssselect2-0.7.0-py3-none-any.whl.metadata\r\n",
|
||
" Downloading cssselect2-0.7.0-py3-none-any.whl.metadata (2.9 kB)\r\n",
|
||
"Requirement already satisfied: defusedxml in /Users/vasilije/cognee/.venv/lib/python3.11/site-packages (from cairosvg) (0.7.1)\r\n",
|
||
"Requirement already satisfied: pillow in /Users/vasilije/cognee/.venv/lib/python3.11/site-packages (from cairosvg) (11.0.0)\r\n",
|
||
"Requirement already satisfied: tinycss2 in /Users/vasilije/cognee/.venv/lib/python3.11/site-packages (from cairosvg) (1.4.0)\r\n",
|
||
"Requirement already satisfied: cffi>=1.1.0 in /Users/vasilije/cognee/.venv/lib/python3.11/site-packages (from cairocffi->cairosvg) (1.17.1)\r\n",
|
||
"Requirement already satisfied: webencodings in /Users/vasilije/cognee/.venv/lib/python3.11/site-packages (from cssselect2->cairosvg) (0.5.1)\r\n",
|
||
"Requirement already satisfied: pycparser in /Users/vasilije/cognee/.venv/lib/python3.11/site-packages (from cffi>=1.1.0->cairocffi->cairosvg) (2.22)\r\n",
|
||
"Downloading CairoSVG-2.7.1-py3-none-any.whl (43 kB)\r\n",
|
||
"\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m43.2/43.2 kB\u001B[0m \u001B[31m1.6 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\r\n",
|
||
"\u001B[?25hDownloading cairocffi-1.7.1-py3-none-any.whl (75 kB)\r\n",
|
||
"\u001B[2K \u001B[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001B[0m \u001B[32m75.6/75.6 kB\u001B[0m \u001B[31m4.3 MB/s\u001B[0m eta \u001B[36m0:00:00\u001B[0m\r\n",
|
||
"\u001B[?25hDownloading cssselect2-0.7.0-py3-none-any.whl (15 kB)\r\n",
|
||
"Installing collected packages: cssselect2, cairocffi, cairosvg\r\n",
|
||
"Successfully installed cairocffi-1.7.1 cairosvg-2.7.1 cssselect2-0.7.0\r\n",
|
||
"\r\n",
|
||
"\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m A new release of pip is available: \u001B[0m\u001B[31;49m23.2.1\u001B[0m\u001B[39;49m -> \u001B[0m\u001B[32;49m24.3.1\u001B[0m\r\n",
|
||
"\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m To update, run: \u001B[0m\u001B[32;49mpip install --upgrade pip\u001B[0m\r\n"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 25
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-29T22:05:06.689621Z",
|
||
"start_time": "2024-12-29T22:05:06.688268Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": "\n",
|
||
"id": "7014c8acf720b50",
|
||
"outputs": [],
|
||
"execution_count": 70
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-29T22:27:10.842044Z",
|
||
"start_time": "2024-12-29T22:27:10.826453Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": [
|
||
"from PIL import Image\n",
|
||
"import potrace\n",
|
||
"\n",
|
||
"def png_to_svg(png_path, threshold=128):\n",
|
||
" \"\"\"\n",
|
||
" Converts a PNG image to an SVG string using potrace and prints the output.\n",
|
||
"\n",
|
||
" :param png_path: Path to the input PNG file.\n",
|
||
" :param threshold: Grayscale threshold for converting to black-and-white (default: 128).\n",
|
||
" :return: The SVG content as a string.\n",
|
||
" \"\"\"\n",
|
||
" # Load the PNG image and convert to grayscale\n",
|
||
" img = Image.open(png_path).convert(\"L\")\n",
|
||
"\n",
|
||
" # Convert image to black-and-white\n",
|
||
" img = img.point(lambda p: p > threshold and 255)\n",
|
||
"\n",
|
||
" # Convert the image to a bitmap for potrace\n",
|
||
" bitmap = potrace.Bitmap(img)\n",
|
||
"\n",
|
||
" # Trace the bitmap into a path\n",
|
||
" path = bitmap.trace()\n",
|
||
"\n",
|
||
" # Generate SVG output\n",
|
||
" svg_content = '<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\\n'\n",
|
||
" svg_content += '<svg xmlns=\"http://www.w3.org/2000/svg\">\\n'\n",
|
||
" for curve in path:\n",
|
||
" svg_content += f' <path d=\"{curve.tosvg()}\" fill=\"black\"/>\\n'\n",
|
||
" svg_content += '</svg>\\n'\n",
|
||
"\n",
|
||
" # Print the SVG content\n",
|
||
" print(svg_content)\n",
|
||
"\n",
|
||
" # Return the SVG content as a string\n",
|
||
" return svg_content\n"
|
||
],
|
||
"id": "494ff0eafc9892db",
|
||
"outputs": [
|
||
{
|
||
"ename": "ModuleNotFoundError",
|
||
"evalue": "No module named 'potrace'",
|
||
"output_type": "error",
|
||
"traceback": [
|
||
"\u001B[0;31m---------------------------------------------------------------------------\u001B[0m",
|
||
"\u001B[0;31mModuleNotFoundError\u001B[0m Traceback (most recent call last)",
|
||
"Cell \u001B[0;32mIn[87], line 2\u001B[0m\n\u001B[1;32m 1\u001B[0m \u001B[38;5;28;01mfrom\u001B[39;00m \u001B[38;5;21;01mPIL\u001B[39;00m \u001B[38;5;28;01mimport\u001B[39;00m Image\n\u001B[0;32m----> 2\u001B[0m \u001B[38;5;28;01mimport\u001B[39;00m \u001B[38;5;21;01mpotrace\u001B[39;00m\n\u001B[1;32m 4\u001B[0m \u001B[38;5;28;01mdef\u001B[39;00m \u001B[38;5;21mpng_to_svg\u001B[39m(png_path, threshold\u001B[38;5;241m=\u001B[39m\u001B[38;5;241m128\u001B[39m):\n\u001B[1;32m 5\u001B[0m \u001B[38;5;250m \u001B[39m\u001B[38;5;124;03m\"\"\"\u001B[39;00m\n\u001B[1;32m 6\u001B[0m \u001B[38;5;124;03m Converts a PNG image to an SVG string using potrace and prints the output.\u001B[39;00m\n\u001B[1;32m 7\u001B[0m \n\u001B[0;32m (...)\u001B[0m\n\u001B[1;32m 10\u001B[0m \u001B[38;5;124;03m :return: The SVG content as a string.\u001B[39;00m\n\u001B[1;32m 11\u001B[0m \u001B[38;5;124;03m \"\"\"\u001B[39;00m\n",
|
||
"\u001B[0;31mModuleNotFoundError\u001B[0m: No module named 'potrace'"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 87
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-29T22:27:11.546173Z",
|
||
"start_time": "2024-12-29T22:27:11.533929Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": [
"import base64\n",
"import hashlib\n",
"import os\n",
"import pathlib\n",
"from datetime import datetime, timezone\n",
"from typing import BinaryIO, Union\n",
"from uuid import uuid4\n",
"\n",
"import graphistry\n",
"import matplotlib.pyplot as plt\n",
"import networkx as nx\n",
"import nltk\n",
"import numpy as np\n",
"import pandas as pd\n",
"import requests\n",
"import tiktoken\n",
"from bokeh.io import output_file, save\n",
"from bokeh.models import Circle, MultiLine, HoverTool, ColumnDataSource, Range1d\n",
"from bokeh.plotting import figure, from_networkx\n",
"\n",
"from cognee.base_config import get_base_config\n",
"from cognee.infrastructure.databases.graph import get_graph_engine\n",
"from cognee.shared.exceptions import IngestionError\n",
|
||
"\n",
|
||
"\n",
|
||
"async def create_cognee_style_network_with_logo(\n",
|
||
" G,\n",
|
||
" output_filename: str = \"cognee_network_with_logo.html\",\n",
|
||
" title: str = \"Cognee-Style Network\",\n",
|
||
" node_attribute: str = \"group\",\n",
|
||
" layout_func=nx.spring_layout,\n",
|
||
" layout_scale: float = 3.0,\n",
|
||
" logo_alpha: float = 0.1, # Transparency of the logo\n",
|
||
"):\n",
|
||
" \"\"\"\n",
|
||
" Create a Cognee-inspired network visualization with an embedded logo as a watermark.\n",
|
||
"\n",
|
||
" :param G: Graph data.\n",
|
||
" :param output_filename: The HTML file where the visualization will be saved.\n",
|
||
" :param title: Title of the visualization.\n",
|
||
" :param node_attribute: Node attribute used to color the nodes.\n",
|
||
" :param layout_func: Layout function from NetworkX (e.g., nx.spring_layout).\n",
|
||
" :param layout_scale: Scale of the layout.\n",
|
||
" :param logo_alpha: Transparency of the logo (0 = fully transparent, 1 = fully opaque).\n",
|
||
" \"\"\"\n",
|
||
"\n",
"    import uuid\n",
"\n",
"    # Build a NetworkX graph from the (nodes, edges) tuple\n",
"    (nodes, edges) = G\n",
"    networkx_graph = nx.MultiDiGraph()\n",
"    networkx_graph.add_nodes_from(nodes)\n",
"    networkx_graph.add_edges_from(edges)\n",
"\n",
"    # Copy into a new graph with string node IDs and UUID attribute values converted to strings\n",
"    new_G = nx.MultiDiGraph()\n",
"    for node, data in networkx_graph.nodes(data=True):\n",
"        serializable_data = {key: str(value) if isinstance(value, uuid.UUID) else value for key, value in data.items()}\n",
"        new_G.add_node(str(node), **serializable_data)\n",
"    for u, v, data in networkx_graph.edges(data=True):\n",
"        serializable_data = {key: str(value) if isinstance(value, uuid.UUID) else value for key, value in data.items()}\n",
"        new_G.add_edge(str(u), str(v), **serializable_data)\n",
"\n",
"    G = new_G  # Use the serializable graph from here on\n",
|
||
"\n",
|
||
" # Prepare Bokeh output\n",
|
||
" output_file(output_filename)\n",
|
||
"\n",
|
||
" # Create a figure with light gray background\n",
|
||
" p = figure(\n",
|
||
" title=title,\n",
|
||
" tools=\"pan,wheel_zoom,save,reset,hover\",\n",
|
||
" active_scroll=\"wheel_zoom\",\n",
|
||
" width=1200,\n",
|
||
" height=900,\n",
|
||
" background_fill_color=\"#F4F4F4\", # Light Gray\n",
|
||
" toolbar_location=\"below\",\n",
|
||
" x_range=Range1d(-layout_scale, layout_scale),\n",
|
||
" y_range=Range1d(-layout_scale, layout_scale),\n",
|
||
" )\n",
|
||
" p.toolbar.logo = None\n",
|
||
" p.axis.visible = False\n",
|
||
" p.grid.visible = False\n",
|
||
" p.outline_line_color = None\n",
|
||
"\n",
|
||
" import cairosvg\n",
|
||
" \n",
|
||
" svg_logo =\"\"\"<svg width=\"1294\" height=\"324\" viewBox=\"0 0 1294 324\" fill=\"none\" xmlns=\"http://www.w3.org/2000/svg\">\n",
|
||
" <mask id=\"mask0_103_2579\" style=\"mask-type:alpha\" maskUnits=\"userSpaceOnUse\" x=\"0\" y=\"0\" width=\"1294\" height=\"324\">\n",
|
||
" <path fill-rule=\"evenodd\" clip-rule=\"evenodd\" d=\"M380.648 131.09C365.133 131.09 353.428 142.843 353.428 156.285V170.258C353.428 183.7 365.133 195.452 380.648 195.452C388.268 195.452 393.57 193.212 401.288 187.611C405.57 184.506 411.579 185.449 414.682 189.714C417.805 193.978 416.842 199.953 412.561 203.038C402.938 209.995 393.727 214.515 380.628 214.515C355.49 214.555 334.241 195.197 334.241 170.258V156.285C334.241 131.366 355.49 112.008 380.648 112.008C393.747 112.008 402.958 116.528 412.581 123.485C416.862 126.59 417.805 132.545 414.702 136.809C411.579 141.074 405.589 142.017 401.308 138.912C393.59 133.331 388.268 131.071 380.667 131.071L380.648 131.09ZM474.875 131.09C459.792 131.09 447.557 143.255 447.557 158.289V168.509C447.557 183.543 459.792 195.708 474.875 195.708C489.958 195.708 501.977 183.602 501.977 168.509V158.289C501.977 143.196 489.879 131.09 474.875 131.09ZM428.37 158.289C428.37 132.741 449.188 112.008 474.875 112.008C500.563 112.008 521.164 132.8 521.164 158.289V168.509C521.164 193.998 500.622 214.79 474.875 214.79C449.129 214.79 428.37 194.057 428.37 168.509V158.289ZM584.774 131.601C569.652 131.601 557.457 143.747 557.457 158.683C557.457 173.618 569.672 185.764 584.774 185.764C599.877 185.764 611.876 173.697 611.876 158.683C611.876 143.668 599.818 131.601 584.774 131.601ZM538.269 158.683C538.269 133.154 559.126 112.519 584.774 112.519C595.693 112.519 605.67 116.253 613.545 122.483L620.733 115.329C624.484 111.595 630.552 111.595 634.303 115.329C638.054 119.063 638.054 125.096 634.303 128.83L625.819 137.281C629.178 143.688 631.063 150.979 631.063 158.702C631.063 184.152 610.501 204.866 584.774 204.866C584.519 204.866 584.264 204.866 584.008 204.866H563.643C560.226 204.866 557.457 207.617 557.457 211.017C557.457 214.417 560.226 217.168 563.643 217.168H589.939H598.345C605.258 217.168 612.426 219.075 618.18 223.614C624.131 228.292 627.901 235.229 628.569 243.739C629.747 258.812 619.123 269.11 610.482 272.431L586.444 283.004C581.593 285.127 
575.937 282.945 573.796 278.131C571.655 273.316 573.855 267.675 578.686 265.553L602.96 254.882C603.137 254.803 603.333 254.724 603.51 254.665C604.531 254.292 606.259 253.191 607.614 251.364C608.871 249.674 609.598 247.649 609.421 245.252C609.146 241.754 607.811 239.808 606.259 238.609C604.551 237.253 601.84 236.271 598.325 236.271H564.036C563.937 236.271 563.839 236.271 563.721 236.271H563.604C549.601 236.271 538.23 224.97 538.23 211.037C538.23 201.997 543.002 194.077 550.171 189.616C542.747 181.44 538.23 170.612 538.23 158.722L538.269 158.683ZM694.045 131.601C679.021 131.601 666.825 143.727 666.825 158.683V205.239C666.825 210.506 662.525 214.79 657.242 214.79C651.959 214.79 647.658 210.526 647.658 205.239V158.683C647.658 133.193 668.436 112.519 694.065 112.519C719.693 112.519 740.471 133.193 740.471 158.683V205.239C740.471 210.506 736.17 214.79 730.887 214.79C725.605 214.79 721.304 210.526 721.304 205.239V158.683C721.304 143.727 709.128 131.601 694.084 131.601H694.045ZM807.204 131.621C791.748 131.621 779.356 143.963 779.356 159.017V168.843C779.356 183.897 791.748 196.238 807.204 196.238C812.565 196.238 817.514 194.745 821.698 192.19C826.214 189.439 832.126 190.834 834.895 195.334C837.664 199.835 836.25 205.711 831.733 208.462C824.604 212.825 816.179 215.321 807.204 215.321C781.3 215.321 760.169 194.588 760.169 168.843V159.017C760.169 133.272 781.3 112.538 807.204 112.538C829.357 112.538 847.778 127.671 852.707 148.07L854.632 156.049L813.744 172.597C808.834 174.581 803.237 172.243 801.234 167.349C799.231 162.475 801.587 156.894 806.497 154.909L830.947 145.004C826.156 136.986 817.338 131.601 807.165 131.601L807.204 131.621ZM912.37 131.621C896.914 131.621 884.522 143.963 884.522 159.017V168.843C884.522 183.897 896.914 196.238 912.37 196.238C917.732 196.238 922.681 194.745 926.864 192.19C928.965 190.913 930.89 189.36 932.559 187.572C936.192 183.72 942.261 183.543 946.11 187.139C949.979 190.736 950.175 196.789 946.542 200.621C943.694 203.628 940.454 206.281 936.879 
208.462C929.731 212.825 921.326 215.321 912.331 215.321C886.427 215.321 865.296 194.588 865.296 168.843V159.017C865.296 133.272 886.427 112.538 912.331 112.538C934.484 112.538 952.905 127.671 957.834 148.07L959.759 156.049L918.871 172.597C913.961 174.581 908.364 172.243 906.361 167.349C904.358 162.475 906.714 156.894 911.624 154.909L936.074 145.004C931.282 136.986 922.465 131.601 912.292 131.601L912.37 131.621Z\" fill=\"#6510F4\"/>\n",
|
||
" </mask>\n",
|
||
" <g mask=\"url(#mask0_103_2579)\">\n",
|
||
" <rect x=\"86\" y=\"-119\" width=\"1120\" height=\"561\" fill=\"#6510F4\"/>\n",
|
||
" </g>\n",
|
||
" </svg>\"\"\"\n",
|
||
"\n",
|
||
"\n",
|
||
" png_data = cairosvg.svg2png(bytestring=svg_logo.encode(\"utf-8\"))\n",
|
||
" logo_base64 = base64.b64encode(png_data).decode(\"utf-8\")\n",
|
||
" logo_url = f\"data:image/png;base64,{logo_base64}\"\n",
|
||
"\n",
"    # Add the logo as a faint watermark. The images are added before the graph\n",
"    # renderer, so they should be drawn underneath the nodes and edges.\n",
"    # First copy: centered in the upper-left quadrant.\n",
"    p.image_url(\n",
"        url=[logo_url],\n",
"        x=-layout_scale * 0.5,\n",
"        y=layout_scale * 0.5,\n",
"        w=layout_scale * 1.2,  # Width of the logo in data units\n",
"        h=layout_scale * 1.2,  # Height of the logo in data units\n",
"        anchor=\"center\",\n",
"        global_alpha=logo_alpha,  # Transparency for watermark effect\n",
"    )\n",
"    # Second copy: centered in the lower-right quadrant.\n",
"    p.image_url(\n",
"        url=[logo_url],\n",
"        x=layout_scale * 0.5,\n",
"        y=-layout_scale * 0.5,\n",
"        w=layout_scale * 1.2,  # Width of the logo in data units\n",
"        h=layout_scale * 1.2,  # Height of the logo in data units\n",
"        anchor=\"center\",\n",
"        global_alpha=logo_alpha,  # Transparency for watermark effect\n",
"    )\n",
|
||
"\n",
"    # Generate the graph layout. from_networkx forwards keyword arguments only when it is\n",
"    # given a layout function, so pass layout_func itself; with a precomputed dict,\n",
"    # scale and center would be ignored.\n",
"    graph_renderer = from_networkx(\n",
"        G,\n",
"        layout_func,\n",
"        scale=layout_scale,\n",
"        center=(0, 0),\n",
"    )\n",
|
||
"\n",
|
||
" # Compute node sizes based on centrality\n",
|
||
" centrality = nx.degree_centrality(G)\n",
|
||
" node_radii = [0.02 + 0.1 * centrality[node] for node in G.nodes()]\n",
|
||
" graph_renderer.node_renderer.data_source.data[\"radius\"] = node_radii\n",
|
||
"\n",
|
||
" # Apply Cognee-inspired colors for nodes\n",
|
||
" cognee_colors = [\"#6510F4\", \"#0DFF00\", \"#FFFFFF\"] # Violet, Green, White\n",
"    # Fall back to the node's 'id', then to 'Unknown', when the attribute is missing\n",
"    node_attr_values = [\n",
"        G.nodes[node].get(node_attribute, G.nodes[node].get('id', 'Unknown'))\n",
"        for node in G.nodes()\n",
"    ]\n",
"    # De-duplicate while keeping first-seen order, so colors are assigned deterministically\n",
"    unique_attrs = list(dict.fromkeys(node_attr_values))\n",
"    color_map = {attr: cognee_colors[i % len(cognee_colors)] for i, attr in enumerate(unique_attrs)}\n",
"    node_colors = [color_map[value] for value in node_attr_values]\n",
|
||
"\n",
|
||
" graph_renderer.node_renderer.data_source.data[\"fill_color\"] = node_colors\n",
|
||
"\n",
|
||
" # Style nodes\n",
|
||
" graph_renderer.node_renderer.glyph = Circle(\n",
|
||
" radius=\"radius\",\n",
|
||
" fill_color=\"fill_color\",\n",
|
||
" fill_alpha=0.9,\n",
|
||
" line_color=\"#000000\", # Abyss Black outline\n",
|
||
" line_width=1.5,\n",
|
||
" )\n",
|
||
" graph_renderer.node_renderer.hover_glyph = Circle(\n",
|
||
" radius=\"radius\",\n",
|
||
" fill_color=\"#FFFFFF\", # White fill on hover\n",
|
||
" fill_alpha=1.0,\n",
|
||
" line_color=\"#6510F4\", # Violet outline on hover\n",
|
||
" line_width=2.5,\n",
|
||
" )\n",
|
||
"\n",
|
||
" # Style edges\n",
|
||
" graph_renderer.edge_renderer.glyph = MultiLine(\n",
|
||
" line_color=\"#000000\", # Abyss Black edges\n",
|
||
" line_alpha=0.3,\n",
|
||
" line_width=1.5,\n",
|
||
" )\n",
|
||
" graph_renderer.edge_renderer.hover_glyph = MultiLine(\n",
|
||
" line_color=\"#0DFF00\", # Green on hover\n",
|
||
" line_alpha=0.8,\n",
|
||
" line_width=2.0,\n",
|
||
" )\n",
|
||
"\n",
"    # Hover tool for node tooltips\n",
"    node_source = graph_renderer.node_renderer.data_source\n",
"    node_source.data[node_attribute] = [\n",
"        G.nodes[node].get(node_attribute, G.nodes[node].get('id', 'Unknown'))\n",
"        for node in G.nodes()\n",
"    ]\n",
"    # Expose the raw centrality so the tooltip does not report the glyph radius instead\n",
"    node_source.data[\"centrality\"] = [centrality[node] for node in G.nodes()]\n",
"    hover_tool = HoverTool(\n",
"        tooltips=[\n",
"            (\"Node\", \"@index\"),\n",
"            (node_attribute.capitalize(), f\"@{node_attribute}\"),\n",
"            (\"Centrality\", \"@centrality{0.00}\"),\n",
"        ],\n",
"        renderers=[graph_renderer],\n",
"    )\n",
|
||
" p.add_tools(hover_tool)\n",
|
||
"\n",
|
||
" # Add the graph to the plot\n",
|
||
" p.renderers.append(graph_renderer)\n",
"    from bokeh.io import output_notebook, show\n",
"    from bokeh.embed import file_html\n",
"    from bokeh.resources import CDN\n",
"\n",
"    # Display the plot inline in the notebook\n",
"    output_notebook()\n",
"    show(p)\n",
"\n",
"    # Render standalone HTML and save it to disk\n",
"    html_content = file_html(p, CDN, title=title)\n",
"    with open(output_filename, \"w\") as f:\n",
"        f.write(html_content)\n",
"    print(f\"HTML plot saved to '{output_filename}'\")\n",
|
||
"\n",
|
||
" # Return the HTML content as a string\n",
|
||
" return html_content\n"
|
||
],
|
||
"id": "296c8298f25d53d8",
|
||
"outputs": [],
|
||
"execution_count": 88
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-30T09:50:38.625634Z",
|
||
"start_time": "2024-12-30T09:50:38.614895Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": [
"import base64\n",
"import logging\n",
"\n",
"import cairosvg\n",
"import networkx as nx\n",
"from bokeh.embed import file_html\n",
"from bokeh.io import output_notebook\n",
"from bokeh.models import Circle, MultiLine, HoverTool, Range1d\n",
"from bokeh.plotting import figure, from_networkx, output_file, show\n",
"from bokeh.resources import CDN\n",
|
||
"\n",
|
||
"logging.basicConfig(level=logging.INFO)\n",
|
||
"\n",
|
||
"def convert_to_serializable_graph(G):\n",
|
||
" \"\"\"\n",
|
||
" Convert a graph into a serializable format with stringified node and edge attributes.\n",
|
||
" \"\"\"\n",
|
||
" (nodes, edges) = G\n",
|
||
" networkx_graph = nx.MultiDiGraph()\n",
|
||
"\n",
|
||
" networkx_graph.add_nodes_from(nodes)\n",
|
||
" networkx_graph.add_edges_from(edges)\n",
|
||
" G = networkx_graph\n",
|
||
" new_G = nx.MultiDiGraph() if isinstance(G, nx.MultiDiGraph) else nx.Graph()\n",
|
||
" for node, data in G.nodes(data=True):\n",
|
||
" serializable_data = {k: str(v) for k, v in data.items()}\n",
|
||
" new_G.add_node(str(node), **serializable_data)\n",
|
||
" for u, v, data in G.edges(data=True):\n",
|
||
" serializable_data = {k: str(v) for k, v in data.items()}\n",
|
||
" new_G.add_edge(str(u), str(v), **serializable_data)\n",
|
||
" return new_G\n",
|
||
"\n",
|
||
"def generate_layout_positions(G, layout_func, layout_scale):\n",
|
||
" \"\"\"\n",
|
||
" Generate layout positions for the graph using the specified layout function.\n",
|
||
" \"\"\"\n",
|
||
" positions = layout_func(G)\n",
|
||
" return {str(node): (x * layout_scale, y * layout_scale) for node, (x, y) in positions.items()}\n",
|
||
"\n",
|
||
"def assign_node_colors(G, node_attribute, palette):\n",
|
||
" \"\"\"\n",
|
||
" Assign colors to nodes based on a specified attribute and a given palette.\n",
|
||
" \"\"\"\n",
|
||
" unique_attrs = set(G.nodes[node].get(node_attribute, \"Unknown\") for node in G.nodes)\n",
|
||
" color_map = {attr: palette[i % len(palette)] for i, attr in enumerate(unique_attrs)}\n",
|
||
" return [color_map[G.nodes[node].get(node_attribute, \"Unknown\")] for node in G.nodes], color_map\n",
|
||
"\n",
|
||
"def embed_logo(p, layout_scale, logo_alpha):\n",
|
||
" \"\"\"\n",
|
||
" Embed a logo into the graph visualization as a watermark.\n",
|
||
" \"\"\"\n",
|
||
" svg_logo=\"\"\"<svg width=\"1294\" height=\"324\" viewBox=\"0 0 1294 324\" fill=\"none\" xmlns=\"http://www.w3.org/2000/svg\">\n",
|
||
" <mask id=\"mask0_103_2579\" style=\"mask-type:alpha\" maskUnits=\"userSpaceOnUse\" x=\"0\" y=\"0\" width=\"1294\" height=\"324\">\n",
|
||
" <path fill-rule=\"evenodd\" clip-rule=\"evenodd\" d=\"M380.648 131.09C365.133 131.09 353.428 142.843 353.428 156.285V170.258C353.428 183.7 365.133 195.452 380.648 195.452C388.268 195.452 393.57 193.212 401.288 187.611C405.57 184.506 411.579 185.449 414.682 189.714C417.805 193.978 416.842 199.953 412.561 203.038C402.938 209.995 393.727 214.515 380.628 214.515C355.49 214.555 334.241 195.197 334.241 170.258V156.285C334.241 131.366 355.49 112.008 380.648 112.008C393.747 112.008 402.958 116.528 412.581 123.485C416.862 126.59 417.805 132.545 414.702 136.809C411.579 141.074 405.589 142.017 401.308 138.912C393.59 133.331 388.268 131.071 380.667 131.071L380.648 131.09ZM474.875 131.09C459.792 131.09 447.557 143.255 447.557 158.289V168.509C447.557 183.543 459.792 195.708 474.875 195.708C489.958 195.708 501.977 183.602 501.977 168.509V158.289C501.977 143.196 489.879 131.09 474.875 131.09ZM428.37 158.289C428.37 132.741 449.188 112.008 474.875 112.008C500.563 112.008 521.164 132.8 521.164 158.289V168.509C521.164 193.998 500.622 214.79 474.875 214.79C449.129 214.79 428.37 194.057 428.37 168.509V158.289ZM584.774 131.601C569.652 131.601 557.457 143.747 557.457 158.683C557.457 173.618 569.672 185.764 584.774 185.764C599.877 185.764 611.876 173.697 611.876 158.683C611.876 143.668 599.818 131.601 584.774 131.601ZM538.269 158.683C538.269 133.154 559.126 112.519 584.774 112.519C595.693 112.519 605.67 116.253 613.545 122.483L620.733 115.329C624.484 111.595 630.552 111.595 634.303 115.329C638.054 119.063 638.054 125.096 634.303 128.83L625.819 137.281C629.178 143.688 631.063 150.979 631.063 158.702C631.063 184.152 610.501 204.866 584.774 204.866C584.519 204.866 584.264 204.866 584.008 204.866H563.643C560.226 204.866 557.457 207.617 557.457 211.017C557.457 214.417 560.226 217.168 563.643 217.168H589.939H598.345C605.258 217.168 612.426 219.075 618.18 223.614C624.131 228.292 627.901 235.229 628.569 243.739C629.747 258.812 619.123 269.11 610.482 272.431L586.444 283.004C581.593 285.127 
575.937 282.945 573.796 278.131C571.655 273.316 573.855 267.675 578.686 265.553L602.96 254.882C603.137 254.803 603.333 254.724 603.51 254.665C604.531 254.292 606.259 253.191 607.614 251.364C608.871 249.674 609.598 247.649 609.421 245.252C609.146 241.754 607.811 239.808 606.259 238.609C604.551 237.253 601.84 236.271 598.325 236.271H564.036C563.937 236.271 563.839 236.271 563.721 236.271H563.604C549.601 236.271 538.23 224.97 538.23 211.037C538.23 201.997 543.002 194.077 550.171 189.616C542.747 181.44 538.23 170.612 538.23 158.722L538.269 158.683ZM694.045 131.601C679.021 131.601 666.825 143.727 666.825 158.683V205.239C666.825 210.506 662.525 214.79 657.242 214.79C651.959 214.79 647.658 210.526 647.658 205.239V158.683C647.658 133.193 668.436 112.519 694.065 112.519C719.693 112.519 740.471 133.193 740.471 158.683V205.239C740.471 210.506 736.17 214.79 730.887 214.79C725.605 214.79 721.304 210.526 721.304 205.239V158.683C721.304 143.727 709.128 131.601 694.084 131.601H694.045ZM807.204 131.621C791.748 131.621 779.356 143.963 779.356 159.017V168.843C779.356 183.897 791.748 196.238 807.204 196.238C812.565 196.238 817.514 194.745 821.698 192.19C826.214 189.439 832.126 190.834 834.895 195.334C837.664 199.835 836.25 205.711 831.733 208.462C824.604 212.825 816.179 215.321 807.204 215.321C781.3 215.321 760.169 194.588 760.169 168.843V159.017C760.169 133.272 781.3 112.538 807.204 112.538C829.357 112.538 847.778 127.671 852.707 148.07L854.632 156.049L813.744 172.597C808.834 174.581 803.237 172.243 801.234 167.349C799.231 162.475 801.587 156.894 806.497 154.909L830.947 145.004C826.156 136.986 817.338 131.601 807.165 131.601L807.204 131.621ZM912.37 131.621C896.914 131.621 884.522 143.963 884.522 159.017V168.843C884.522 183.897 896.914 196.238 912.37 196.238C917.732 196.238 922.681 194.745 926.864 192.19C928.965 190.913 930.89 189.36 932.559 187.572C936.192 183.72 942.261 183.543 946.11 187.139C949.979 190.736 950.175 196.789 946.542 200.621C943.694 203.628 940.454 206.281 936.879 
208.462C929.731 212.825 921.326 215.321 912.331 215.321C886.427 215.321 865.296 194.588 865.296 168.843V159.017C865.296 133.272 886.427 112.538 912.331 112.538C934.484 112.538 952.905 127.671 957.834 148.07L959.759 156.049L918.871 172.597C913.961 174.581 908.364 172.243 906.361 167.349C904.358 162.475 906.714 156.894 911.624 154.909L936.074 145.004C931.282 136.986 922.465 131.601 912.292 131.601L912.37 131.621Z\" fill=\"#6510F4\"/>\n",
|
||
" </mask>\n",
|
||
" <g mask=\"url(#mask0_103_2579)\">\n",
|
||
" <rect x=\"86\" y=\"-119\" width=\"1120\" height=\"561\" fill=\"#6510F4\"/>\n",
|
||
" </g>\n",
" </svg>\"\"\"\n",
|
||
" png_data = cairosvg.svg2png(bytestring=svg_logo.encode(\"utf-8\"))\n",
|
||
" logo_base64 = base64.b64encode(png_data).decode(\"utf-8\")\n",
|
||
" logo_url = f\"data:image/png;base64,{logo_base64}\"\n",
|
||
" p.image_url(\n",
|
||
" url=[logo_url],\n",
|
||
" x=-layout_scale * 0.5,\n",
|
||
" y=layout_scale * 0.5,\n",
|
||
" w=layout_scale,\n",
|
||
" h=layout_scale,\n",
|
||
" anchor=\"center\",\n",
|
||
" global_alpha=logo_alpha,\n",
|
||
" )\n",
|
||
"\n",
|
||
"def style_and_render_graph(p, G, layout_positions, node_attribute, node_colors, centrality):\n",
|
||
" \"\"\"\n",
|
||
" Apply styling and render the graph into the plot.\n",
|
||
" \"\"\"\n",
|
||
" graph_renderer = from_networkx(G, layout_positions)\n",
|
||
" node_radii = [0.02 + 0.1 * centrality[node] for node in G.nodes()]\n",
|
||
" graph_renderer.node_renderer.data_source.data[\"radius\"] = node_radii\n",
|
||
" graph_renderer.node_renderer.data_source.data[\"fill_color\"] = node_colors\n",
|
||
" graph_renderer.node_renderer.glyph = Circle(\n",
|
||
" radius=\"radius\",\n",
|
||
" fill_color=\"fill_color\",\n",
|
||
" fill_alpha=0.9,\n",
|
||
" line_color=\"#000000\",\n",
|
||
" line_width=1.5,\n",
|
||
" )\n",
|
||
" graph_renderer.edge_renderer.glyph = MultiLine(\n",
|
||
" line_color=\"#000000\",\n",
|
||
" line_alpha=0.3,\n",
|
||
" line_width=1.5,\n",
|
||
" )\n",
|
||
" p.renderers.append(graph_renderer)\n",
|
||
" return graph_renderer\n",
|
||
"\n",
|
||
"def create_cognee_style_network_with_logo(\n",
|
||
" G,\n",
"    output_filename=\"cognee_network_with_logo.html\",\n",
|
||
" title=\"Cognee-Style Network\",\n",
|
||
" node_attribute=\"group\",\n",
|
||
" layout_func=nx.spring_layout,\n",
|
||
" layout_scale=3.0,\n",
|
||
" logo_alpha=0.1,\n",
|
||
"):\n",
|
||
" \"\"\"\n",
|
||
" Create a Cognee-inspired network visualization with an embedded logo.\n",
|
||
" \"\"\"\n",
|
||
" logging.info(\"Converting graph to serializable format...\")\n",
|
||
" G = convert_to_serializable_graph(G)\n",
|
||
"\n",
|
||
" logging.info(\"Generating layout positions...\")\n",
" layout_positions = generate_layout_positions(G, layout_func, layout_scale)\n",
"\n",
" logging.info(\"Assigning node colors...\")\n",
" palette = [\"#6510F4\", \"#0DFF00\", \"#FFFFFF\"]\n",
" node_colors, color_map = assign_node_colors(G, node_attribute, palette)\n",
"\n",
" logging.info(\"Calculating centrality...\")\n",
" centrality = nx.degree_centrality(G)\n",
"\n",
" logging.info(\"Preparing Bokeh output...\")\n",
" output_file(output_filename)\n",
" p = figure(\n",
" title=title,\n",
" tools=\"pan,wheel_zoom,save,reset,hover\",\n",
" active_scroll=\"wheel_zoom\",\n",
" width=1200,\n",
" height=900,\n",
" background_fill_color=\"#F4F4F4\",\n",
" x_range=Range1d(-layout_scale, layout_scale),\n",
" y_range=Range1d(-layout_scale, layout_scale),\n",
" )\n",
" p.toolbar.logo = None\n",
" p.axis.visible = False\n",
" p.grid.visible = False\n",
"\n",
" logging.info(\"Embedding logo into visualization...\")\n",
" embed_logo(p, layout_scale, logo_alpha)\n",
"\n",
" logging.info(\"Styling and rendering graph...\")\n",
" style_and_render_graph(p, G, layout_positions, node_attribute, node_colors, centrality)\n",
"\n",
" logging.info(\"Adding hover tool...\")\n",
" hover_tool = HoverTool(\n",
" tooltips=[\n",
" (\"Node\", \"@index\"),\n",
" (node_attribute.capitalize(), f\"@{node_attribute}\"),\n",
" (\"Centrality\", \"@radius{0.00}\"),\n",
" ],\n",
" )\n",
" p.add_tools(hover_tool)\n",
" # from bokeh.io import output_notebook, show\n",
" # \n",
" # # Save the result\n",
" # output_notebook()\n",
" # \n",
" # # Display the plot in the notebook\n",
" # show(p)\n",
"\n",
" logging.info(f\"Saving visualization to {output_filename}...\")\n",
" html_content = file_html(p, CDN, title)\n",
" with open(output_filename, \"w\") as f:\n",
" f.write(html_content)\n",
" # \n",
" # logging.info(\"Visualization complete.\")\n",
" print(html_content)\n",
"\n",
" # return html_content\n",
"\n",
"\n",
"def graph_to_tuple(graph):\n",
" \"\"\"\n",
" Converts a networkx graph to a tuple of (nodes, edges).\n",
"\n",
" :param graph: A networkx graph.\n",
" :return: A tuple (nodes, edges).\n",
" \"\"\"\n",
" nodes = list(graph.nodes(data=True)) # Get nodes with attributes\n",
" edges = list(graph.edges(data=True)) # Get edges with attributes\n",
" return (nodes, edges)"
],
"id": "bf001eeda0f4f450",
"outputs": [],
"execution_count": 110
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2024-12-30T09:50:39.237084Z",
"start_time": "2024-12-30T09:50:39.184107Z"
}
},
"cell_type": "code",
"source": [
"\n",
"import networkx as nx\n",
"\n",
"# graph_data = await graph_engine.get_graph_data()\n",
"# \n",
"# print(graph_data)\n",
"G = nx.random_geometric_graph(50, 0.3)\n",
"# Assign random group attributes for coloring\n",
"for i, node in enumerate(G.nodes()):\n",
" G.nodes[node][\"group\"] = f\"Group {i % 3 + 1}\"\n",
"def graph_to_tuple(graph):\n",
|
||
" \"\"\"\n",
|
||
" Converts a networkx graph to a tuple of (nodes, edges).\n",
|
||
" \n",
|
||
" :param graph: A networkx graph.\n",
|
||
" :return: A tuple (nodes, edges).\n",
|
||
" \"\"\"\n",
|
||
" nodes = list(graph.nodes(data=True)) # Get nodes with attributes\n",
|
||
" edges = list(graph.edges(data=True)) # Get edges with attributes\n",
|
||
" return (nodes, edges)\n",
|
||
"\n",
|
||
"G= graph_to_tuple(G)\n",
" \n",
"print(G)\n",
"\n",
"create_cognee_style_network_with_logo(\n",
" G,\n",
" output_filename=\"cognee_style_network_with_logo.html\",\n",
" title=\"Cognee-Graph Network\",\n",
" node_attribute=\"group\",\n",
" layout_func=nx.spring_layout,\n",
" layout_scale=3.0, \n",
")"
],
"id": "75ff7ff88fd5894e",
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:root:Converting graph to serializable format...INFO:root:Generating layout positions...INFO:root:Assigning node colors...INFO:root:Calculating centrality...INFO:root:Preparing Bokeh output...INFO:bokeh.io.state:Session output file 'cognee_style_network_with_logo.html' already exists, will be overwritten.INFO:root:Embedding logo into visualization...INFO:root:Styling and rendering graph...INFO:root:Adding hover tool...INFO:root:Saving visualization to cognee_style_network_with_logo.html..."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"([(0, {'pos': [0.6391927561230611, 0.9577081939968699], 'group': 'Group 1'}), (1, {'pos': [0.7742314200503037, 0.8238027878749569], 'group': 'Group 2'}), (2, {'pos': [0.5512540135677992, 0.9153577565777516], 'group': 'Group 3'}), (3, {'pos': [0.25732444952662925, 0.1490908794049587], 'group': 'Group 1'}), (4, {'pos': [0.6279736736286188, 0.10785263935795697], 'group': 'Group 2'}), (5, {'pos': [0.914880874135223, 0.74802072749348], 'group': 'Group 3'}), (6, {'pos': [0.26667034242888454, 0.35251088926605467], 'group': 'Group 1'}), (7, {'pos': [0.5120452199291847, 0.03985698500510293], 'group': 'Group 2'}), (8, {'pos': [0.6328184134196475, 0.7746435028883368], 'group': 'Group 3'}), (9, {'pos': [0.9987954567088807, 0.14034841131467046], 'group': 'Group 1'}), (10, {'pos': [0.12058819726432679, 0.13241169932006103], 'group': 'Group 2'}), (11, {'pos': [0.22508864454761268, 0.6079455800309562], 'group': 'Group 3'}), (12, {'pos': [0.11725977735714843, 0.1411642545174081], 'group': 'Group 1'}), (13, {'pos': [0.919394680890905, 0.42166241371429103], 'group': 'Group 2'}), (14, {'pos': [0.25836207293422575, 0.7258827350068001], 'group': 'Group 3'}), (15, {'pos': [0.5172530162885245, 0.3000006396486069], 'group': 'Group 1'}), (16, {'pos': [0.15298901445139557, 0.6194415824148096], 'group': 'Group 2'}), (17, {'pos': [0.5503858383395439, 0.2963881895674162], 'group': 'Group 3'}), (18, {'pos': [0.3129369332017773, 0.9521497094216804], 'group': 'Group 1'}), (19, {'pos': [0.06277239466252482, 0.38715809100060106], 'group': 'Group 2'}), (20, {'pos': [0.5823930417754898, 0.5818352592143017], 'group': 'Group 3'}), (21, {'pos': [0.11925372303762072, 0.23308929448109605], 'group': 'Group 1'}), (22, {'pos': [0.9639456495249757, 0.5916791753212867], 'group': 'Group 2'}), (23, {'pos': [0.3588059198785577, 0.04938679848326155], 'group': 'Group 3'}), (24, {'pos': [0.5490274008918267, 0.06720020630140588], 'group': 'Group 1'}), (25, {'pos': [0.602429225429183, 0.4562231010090445], 'group': 
'Group 2'}), (26, {'pos': [0.09188236933925253, 0.047929414091373634], 'group': 'Group 3'}), (27, {'pos': [0.06377074952137485, 0.2797118663108954], 'group': 'Group 1'}), (28, {'pos': [0.4906626665413346, 0.4663320694792623], 'group': 'Group 2'}), (29, {'pos': [0.6172692666513304, 0.6263884205262166], 'group': 'Group 3'}), (30, {'pos': [0.49741178043085055, 0.01644316714237193], 'group': 'Group 1'}), (31, {'pos': [0.7988288141788339, 0.059965469974182284], 'group': 'Group 2'}), (32, {'pos': [0.7217970607301775, 0.2708413010090548], 'group': 'Group 3'}), (33, {'pos': [0.011511237425662069, 0.028343774982744208], 'group': 'Group 1'}), (34, {'pos': [0.38128826432700813, 0.2895041685619608], 'group': 'Group 2'}), (35, {'pos': [0.07417032576014304, 0.636863466707599], 'group': 'Group 3'}), (36, {'pos': [0.13676493570949355, 0.376877180606942], 'group': 'Group 1'}), (37, {'pos': [0.9323413821739948, 0.8826597647832115], 'group': 'Group 2'}), (38, {'pos': [0.3894337987935055, 0.26448206953319053], 'group': 'Group 3'}), (39, {'pos': [0.8254125197486137, 0.05070075106309235], 'group': 'Group 1'}), (40, {'pos': [0.2743690977404899, 0.29681368611004266], 'group': 'Group 2'}), (41, {'pos': [0.941395039956768, 0.6449589631657039], 'group': 'Group 3'}), (42, {'pos': [0.611788621954416, 0.4739744957109453], 'group': 'Group 1'}), (43, {'pos': [0.2942512190192522, 0.0937049487795869], 'group': 'Group 2'}), (44, {'pos': [0.23196649324366891, 0.16136392127068122], 'group': 'Group 3'}), (45, {'pos': [0.08688565032870654, 0.75331756935591], 'group': 'Group 1'}), (46, {'pos': [0.1510509171930553, 0.9969999993249501], 'group': 'Group 2'}), (47, {'pos': [0.5439996268152169, 0.07151361555903424], 'group': 'Group 3'}), (48, {'pos': [0.17916289804000352, 0.41119976642291267], 'group': 'Group 1'}), (49, {'pos': [0.8746527854119696, 0.5086303883085483], 'group': 'Group 2'})], [(0, 1, {}), (0, 2, {}), (0, 8, {}), (1, 2, {}), (1, 5, {}), (1, 8, {}), (1, 22, {}), (1, 29, {}), (1, 37, {}), (1, 41, 
{}), (2, 8, {}), (2, 18, {}), (2, 29, {}), (3, 6, {}), (3, 7, {}), (3, 10, {}), (3, 12, {}), (3, 21, {}), (3, 23, {}), (3, 26, {}), (3, 27, {}), (3, 30, {}), (3, 33, {}), (3, 34, {}), (3, 36, {}), (3, 38, {}), (3, 40, {}), (3, 43, {}), (3, 44, {}), (3, 47, {}), (3, 48, {}), (4, 7, {}), (4, 15, {}), (4, 17, {}), (4, 23, {}), (4, 24, {}), (4, 30, {}), (4, 31, {}), (4, 32, {}), (4, 38, {}), (4, 39, {}), (4, 47, {}), (5, 8, {}), (5, 22, {}), (5, 37, {}), (5, 41, {}), (5, 49, {}), (6, 10, {}), (6, 11, {}), (6, 12, {}), (6, 15, {}), (6, 16, {}), (6, 17, {}), (6, 19, {}), (6, 21, {}), (6, 27, {}), (6, 28, {}), (6, 34, {}), (6, 36, {}), (6, 38, {}), (6, 40, {}), (6, 43, {}), (6, 44, {}), (6, 48, {}), (7, 15, {}), (7, 17, {}), (7, 23, {}), (7, 24, {}), (7, 30, {}), (7, 31, {}), (7, 34, {}), (7, 38, {}), (7, 43, {}), (7, 47, {}), (8, 20, {}), (8, 29, {}), (9, 13, {}), (9, 31, {}), (9, 39, {}), (10, 12, {}), (10, 19, {}), (10, 21, {}), (10, 23, {}), (10, 26, {}), (10, 27, {}), (10, 33, {}), (10, 36, {}), (10, 38, {}), (10, 40, {}), (10, 43, {}), (10, 44, {}), (10, 48, {}), (11, 14, {}), (11, 16, {}), (11, 19, {}), (11, 35, {}), (11, 36, {}), (11, 45, {}), (11, 48, {}), (12, 19, {}), (12, 21, {}), (12, 23, {}), (12, 26, {}), (12, 27, {}), (12, 33, {}), (12, 36, {}), (12, 38, {}), (12, 40, {}), (12, 43, {}), (12, 44, {}), (12, 48, {}), (13, 22, {}), (13, 32, {}), (13, 41, {}), (13, 49, {}), (14, 16, {}), (14, 18, {}), (14, 35, {}), (14, 45, {}), (14, 46, {}), (15, 17, {}), (15, 20, {}), (15, 23, {}), (15, 24, {}), (15, 25, {}), (15, 28, {}), (15, 30, {}), (15, 32, {}), (15, 34, {}), (15, 38, {}), (15, 40, {}), (15, 42, {}), (15, 47, {}), (16, 19, {}), (16, 35, {}), (16, 36, {}), (16, 45, {}), (16, 48, {}), (17, 20, {}), (17, 24, {}), (17, 25, {}), (17, 28, {}), (17, 30, {}), (17, 32, {}), (17, 34, {}), (17, 38, {}), (17, 40, {}), (17, 42, {}), (17, 47, {}), (18, 46, {}), (19, 21, {}), (19, 27, {}), (19, 35, {}), (19, 36, {}), (19, 40, {}), (19, 44, {}), (19, 48, {}), (20, 25, 
{}), (20, 28, {}), (20, 29, {}), (20, 42, {}), (21, 26, {}), (21, 27, {}), (21, 33, {}), (21, 34, {}), (21, 36, {}), (21, 38, {}), (21, 40, {}), (21, 43, {}), (21, 44, {}), (21, 48, {}), (22, 37, {}), (22, 41, {}), (22, 49, {}), (23, 24, {}), (23, 26, {}), (23, 30, {}), (23, 34, {}), (23, 38, {}), (23, 40, {}), (23, 43, {}), (23, 44, {}), (23, 47, {}), (24, 30, {}), (24, 31, {}), (24, 32, {}), (24, 34, {}), (24, 38, {}), (24, 39, {}), (24, 43, {}), (24, 47, {}), (25, 28, {}), (25, 29, {}), (25, 32, {}), (25, 34, {}), (25, 38, {}), (25, 42, {}), (25, 49, {}), (26, 27, {}), (26, 33, {}), (26, 43, {}), (26, 44, {}), (27, 33, {}), (27, 36, {}), (27, 40, {}), (27, 43, {}), (27, 44, {}), (27, 48, {}), (28, 29, {}), (28, 34, {}), (28, 38, {}), (28, 40, {}), (28, 42, {}), (29, 42, {}), (29, 49, {}), (30, 34, {}), (30, 38, {}), (30, 43, {}), (30, 47, {}), (31, 32, {}), (31, 39, {}), (31, 47, {}), (32, 39, {}), (32, 42, {}), (32, 47, {}), (32, 49, {}), (33, 43, {}), (33, 44, {}), (34, 36, {}), (34, 38, {}), (34, 40, {}), (34, 42, {}), (34, 43, {}), (34, 44, {}), (34, 47, {}), (34, 48, {}), (35, 36, {}), (35, 45, {}), (35, 48, {}), (36, 38, {}), (36, 40, {}), (36, 44, {}), (36, 48, {}), (37, 41, {}), (38, 40, {}), (38, 43, {}), (38, 44, {}), (38, 47, {}), (38, 48, {}), (39, 47, {}), (40, 43, {}), (40, 44, {}), (40, 48, {}), (41, 49, {}), (42, 49, {}), (43, 44, {}), (43, 47, {}), (44, 48, {}), (45, 46, {})])\n",
"<!DOCTYPE html>\n",
"<html lang=\"en\">\n",
" <head>\n",
" <meta charset=\"utf-8\">\n",
" <title>Cognee-Graph Network</title>\n",
" <style>\n",
" html, body {\n",
" box-sizing: border-box;\n",
" display: flow-root;\n",
" height: 100%;\n",
" margin: 0;\n",
" padding: 0;\n",
" }\n",
" </style>\n",
" <script type=\"text/javascript\" src=\"https://cdn.bokeh.org/bokeh/release/bokeh-3.6.2.min.js\"></script>\n",
" <script type=\"text/javascript\">\n",
" Bokeh.set_log_level(\"info\");\n",
" </script>\n",
" </head>\n",
" <body>\n",
" <div id=\"d64c1a7a-b66d-4c81-acec-907537260e4a\" data-root-id=\"p7816\" style=\"display: contents;\"></div>\n",
" \n",
" <script type=\"application/json\" id=\"c389e08f-4d74-4d25-a3d1-a7ad2850e518\">\n",
" {\"4e8e9336-b495-4592-ae17-d9ab2bc2ac1d\":{\"version\":\"3.6.2\",\"title\":\"Bokeh Application\",\"roots\":[{\"type\":\"object\",\"name\":\"Figure\",\"id\":\"p7816\",\"attributes\":{\"width\":1200,\"height\":900,\"x_range\":{\"type\":\"object\",\"name\":\"Range1d\",\"id\":\"p7814\",\"attributes\":{\"start\":-3.0,\"end\":3.0}},\"y_range\":{\"type\":\"object\",\"name\":\"Range1d\",\"id\":\"p7815\",\"attributes\":{\"start\":-3.0,\"end\":3.0}},\"x_scale\":{\"type\":\"object\",\"name\":\"LinearScale\",\"id\":\"p7826\"},\"y_scale\":{\"type\":\"object\",\"name\":\"LinearScale\",\"id\":\"p7827\"},\"title\":{\"type\":\"object\",\"name\":\"Title\",\"id\":\"p7819\",\"attributes\":{\"text\":\"Cognee-Graph Network\"}},\"renderers\":[{\"type\":\"object\",\"name\":\"GlyphRenderer\",\"id\":\"p7849\",\"attributes\":{\"data_source\":{\"type\":\"object\",\"name\":\"ColumnDataSource\",\"id\":\"p7843\",\"attributes\":{\"selected\":{\"type\":\"object\",\"name\":\"Selection\",\"id\":\"p7844\",\"attributes\":{\"indices\":[],\"line_indices\":[]}},\"selection_policy\":{\"type\":\"object\",\"name\":\"UnionRenderers\",\"id\":\"p7845\"},\"data\":{\"type\":\"map\",\"entries\":[[\"url\",[\"\"]]]}}},\"view\":{\"type\":\"object\",\"name\":\"CDSView\",\"id\":\"p7850\",\"attributes\":{\"filter\":{\"type\":\"object\",\"name\":\"AllIndices\",\"id\":\"p7851\"}}},\"glyph\":{\"type\":\"object\",\"name\":\"ImageURL\",\"id\":\"p7846\",\"attributes\":{\"url\":{\"type\":\"field\",\"field\":\"url\"},\"x\":{\"type\":\"value\",\"value\":-1.5},\"y\":{\"type\":\"value\",\"value\":1.5},\"w\":{\"type\":\"value\",\"value\":3.0},\"h\":{\"type\":\"value\",\"value\":3.0},\"global_alpha\":{\"type\":\"value\",\"value\":0.1},\"anchor\":\"center\"}},\"nonselection_glyph\":{\"type\":\"object\",\"name\":\"ImageURL\",\"id\":\"p7847\",\"attributes\":{\"url\":{\"type\":\"field\",\"field\":\"url\"},\"x\":{\"type\":\"value\",\"value\":-1.5},\"y\":{\"type\":\"value\",\"value\":1.5},\"w\":{\"type\":\"value\",\"value\":3.0},\"h\":{
\"type\":\"value\",\"value\":3.0},\"global_alpha\":{\"type\":\"value\",\"value\":0.1},\"anchor\":\"center\"}},\"muted_glyph\":{\"type\":\"object\",\"name\":\"ImageURL\",\"id\":\"p7848\",\"attributes\":{\"url\":{\"type\":\"field\",\"field\":\"url\"},\"x\":{\"type\":\"value\",\"value\":-1.5},\"y\":{\"type\":\"value\",\"value\":1.5},\"w\":{\"type\":\"value\",\"value\":3.0},\"h\":{\"type\":\"value\",\"value\":3.0},\"global_alpha\":{\"type\":\"value\",\"value\":0.2},\"anchor\":\"center\"}}}},{\"type\":\"object\",\"name\":\"GraphRenderer\",\"id\":\"p7852\",\"attributes\":{\"layout_provider\":{\"type\":\"object\",\"name\":\"StaticLayoutProvider\",\"id\":\"p7869\",\"attributes\":{\"graph_layout\":{\"type\":\"map\",\"entries\":[[\"0\",[-0.30878409006941265,-0.9284358034378248]],[\"1\",[-0.36187718360731014,-1.402961560761843]],[\"2\",[-0.8347947421248498,-0.6692243823283017]],[\"3\",[0.42185037002185644,0.8501761337483956]],[\"4\",[0.970988442674195,-0.7383612841753642]],[\"5\",[-0.16684081724754224,-1.6602614828962428]],[\"6\",[-0.03461307511165935,1.143507735316856]],[\"7\",[0.9080846919865031,-0.4157506289828908]],[\"8\",[-0.047695128878838935,-1.261610649968805]],[\"9\",[0.8904816429602009,-1.498611119922551]],[\"10\",[0.10172497521553676,1.4957099695062672]],[\"11\",[-0.8184268799297008,1.7085568307568684]],[\"12\",[0.12417071246756561,1.5041749411547414]],[\"13\",[0.08827442521681117,-1.3334628653990515]],[\"14\",[-1.2798581887556006,1.0503537991390093]],[\"15\",[0.4694879783995627,-0.41644461280048606]],[\"16\",[-0.7599756538270568,1.7918012663679563]],[\"17\",[0.45557432873000453,-0.4511610497998102]],[\"18\",[-1.8061712084636898,0.5358339597221011]],[\"19\",[-0.2988333676073902,2.065349381543105]],[\"20\",[-0.26379624164406423,-1.0996397290777302]],[\"21\",[0.06341189530192765,1.5017634297474483]],[\"22\",[-0.18259026871474598,-1.6202642237791665]],[\"23\",[0.6489745799973031,0.37409898048562873]],[\"24\",[0.9610980307807464,-0.6713627884701636]],[\"25\",[-0.0010092
846185522685,-0.8109646050825678]],[\"26\",[0.2191585303836169,1.2947418937088782]],[\"27\",[0.03830388085749221,1.66205420907971]],[\"28\",[-0.08959183631394242,-0.1805778943759554]],[\"29\",[-0.5083170889481803,-1.8667265914250661]],[\"30\",[0.986616206887394,-0.3047178348564925]],[\"31\",[1.2127316129754986,-1.5603240470800155]],[\"32\",[0.6216761661032147,-1.8448999543835227]],[\"33\",[0.223495794624484,1.2399009265288607]],[\"34\",[0.23211990978268587,0.5686352511314868]],[\"35\",[-0.8338168729698696,1.7773353753442578]],[\"36\",[-0.08005166968451846,1.941745724369899]],[\"37\",[0.062214617745139214,-1.657140195758858]],[\"38\",[0.48344807466104167,0.8334505990147101]],[\"39\",[1.6101299693745719,-1.6313377973443584]],[\"40\",[0.03621757421690418,1.691726598799145]],[\"41\",[-0.36107742239161744,-1.9251882204596007]],[\"42\",[-0.3510909476517835,-1.7093742013922248]],[\"43\",[0.7725399573439482,0.22295709096077126]],[\"44\",[-0.34448178774247035,2.2882256911046324]],[\"45\",[-1.4101858110319139,0.5377101014148036]],[\"46\",[-2.0773447305415207,0.5127656801759363]],[\"47\",[1.8758145806739288,-1.8466742196730193]],[\"48\",[-0.6087517625251723,2.9999999999999996]],[\"49\",[-0.6486128889807317,-2.087097825489553]]]}}},\"node_renderer\":{\"type\":\"object\",\"name\":\"GlyphRenderer\",\"id\":\"p7857\",\"attributes\":{\"data_source\":{\"type\":\"object\",\"name\":\"ColumnDataSource\",\"id\":\"p7854\",\"attributes\":{\"selected\":{\"type\":\"object\",\"name\":\"Selection\",\"id\":\"p7855\",\"attributes\":{\"indices\":[],\"line_indices\":[]}},\"selection_policy\":{\"type\":\"object\",\"name\":\"UnionRenderers\",\"id\":\"p7856\"},\"data\":{\"type\":\"map\",\"entries\":[[\"group\",[\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 
3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\",\"Group 3\",\"Group 1\",\"Group 2\"]],[\"pos\",[\"[0.6391927561230611, 0.9577081939968699]\",\"[0.7742314200503037, 0.8238027878749569]\",\"[0.5512540135677992, 0.9153577565777516]\",\"[0.25732444952662925, 0.1490908794049587]\",\"[0.6279736736286188, 0.10785263935795697]\",\"[0.914880874135223, 0.74802072749348]\",\"[0.26667034242888454, 0.35251088926605467]\",\"[0.5120452199291847, 0.03985698500510293]\",\"[0.6328184134196475, 0.7746435028883368]\",\"[0.9987954567088807, 0.14034841131467046]\",\"[0.12058819726432679, 0.13241169932006103]\",\"[0.22508864454761268, 0.6079455800309562]\",\"[0.11725977735714843, 0.1411642545174081]\",\"[0.919394680890905, 0.42166241371429103]\",\"[0.25836207293422575, 0.7258827350068001]\",\"[0.5172530162885245, 0.3000006396486069]\",\"[0.15298901445139557, 0.6194415824148096]\",\"[0.5503858383395439, 0.2963881895674162]\",\"[0.3129369332017773, 0.9521497094216804]\",\"[0.06277239466252482, 0.38715809100060106]\",\"[0.5823930417754898, 0.5818352592143017]\",\"[0.11925372303762072, 0.23308929448109605]\",\"[0.9639456495249757, 0.5916791753212867]\",\"[0.3588059198785577, 0.04938679848326155]\",\"[0.5490274008918267, 0.06720020630140588]\",\"[0.602429225429183, 0.4562231010090445]\",\"[0.09188236933925253, 0.047929414091373634]\",\"[0.06377074952137485, 0.2797118663108954]\",\"[0.4906626665413346, 0.4663320694792623]\",\"[0.6172692666513304, 0.6263884205262166]\",\"[0.49741178043085055, 0.01644316714237193]\",\"[0.7988288141788339, 0.059965469974182284]\",\"[0.7217970607301775, 0.2708413010090548]\",\"[0.011511237425662069, 0.028343774982744208]\",\"[0.38128826432700813, 0.2895041685619608]\",\"[0.07417032576014304, 0.636863466707599]\",\"[0.13676493570949355, 
0.376877180606942]\",\"[0.9323413821739948, 0.8826597647832115]\",\"[0.3894337987935055, 0.26448206953319053]\",\"[0.8254125197486137, 0.05070075106309235]\",\"[0.2743690977404899, 0.29681368611004266]\",\"[0.941395039956768, 0.6449589631657039]\",\"[0.611788621954416, 0.4739744957109453]\",\"[0.2942512190192522, 0.0937049487795869]\",\"[0.23196649324366891, 0.16136392127068122]\",\"[0.08688565032870654, 0.75331756935591]\",\"[0.1510509171930553, 0.9969999993249501]\",\"[0.5439996268152169, 0.07151361555903424]\",\"[0.17916289804000352, 0.41119976642291267]\",\"[0.8746527854119696, 0.5086303883085483]\"]],[\"index\",[\"0\",\"1\",\"2\",\"3\",\"4\",\"5\",\"6\",\"7\",\"8\",\"9\",\"10\",\"11\",\"12\",\"13\",\"14\",\"15\",\"16\",\"17\",\"18\",\"19\",\"20\",\"21\",\"22\",\"23\",\"24\",\"25\",\"26\",\"27\",\"28\",\"29\",\"30\",\"31\",\"32\",\"33\",\"34\",\"35\",\"36\",\"37\",\"38\",\"39\",\"40\",\"41\",\"42\",\"43\",\"44\",\"45\",\"46\",\"47\",\"48\",\"49\"]],[\"radius\",[0.026122448979591838,0.036326530612244896,0.030204081632653063,0.05673469387755102,0.04244897959183673,0.03224489795918367,0.05673469387755102,0.04448979591836735,0.03224489795918367,0.026122448979591838,0.05061224489795918,0.036326530612244896,0.05061224489795918,0.030204081632653063,0.03224489795918367,0.05265306122448979,0.036326530612244896,0.05061224489795918,0.026122448979591838,0.04448979591836735,0.03428571428571429,0.05061224489795918,0.03224489795918367,0.05061224489795918,0.04653061224489796,0.04040816326530612,0.03836734693877551,0.04653061224489796,0.04040816326530612,0.036326530612244896,0.04244897959183673,0.03428571428571429,0.04244897959183673,0.036326530612244896,0.05877551020408163,0.03428571428571429,0.05061224489795918,0.028163265306122447,0.06285714285714286,0.03224489795918367,0.05469387755102041,0.03224489795918367,0.03836734693877551,0.05469387755102041,0.05265306122448979,0.030204081632653063,0.026122448979591838,0.04857142857142857,0.05061224489795918,0.036326530612244896]],[\"f
ill_color\",[\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\",\"#0DFF00\",\"#6510F4\",\"#FFFFFF\"]]]}}},\"view\":{\"type\":\"object\",\"name\":\"CDSView\",\"id\":\"p7858\",\"attributes\":{\"filter\":{\"type\":\"object\",\"name\":\"AllIndices\",\"id\":\"p7859\"}}},\"glyph\":{\"type\":\"object\",\"name\":\"Circle\",\"id\":\"p7870\",\"attributes\":{\"radius\":{\"type\":\"field\",\"field\":\"radius\"},\"line_color\":{\"type\":\"value\",\"value\":\"#000000\"},\"line_width\":{\"type\":\"value\",\"value\":1.5},\"fill_color\":{\"type\":\"field\",\"field\":\"fill_color\"},\"fill_alpha\":{\"type\":\"value\",\"value\":0.9}}}}},\"edge_renderer\":{\"type\":\"object\",\"name\":\"GlyphRenderer\",\"id\":\"p7864\",\"attributes\":{\"data_source\":{\"type\":\"object\",\"name\":\"ColumnDataSource\",\"id\":\"p7861\",\"attributes\":{\"selected\":{\"type\":\"object\",\"name\":\"Selection\",\"id\":\"p7862\",\"attributes\":{\"indices\":[],\"line_indices\":[]}},\"selection_policy\":{\"type\":\"object\",\"name\":\"UnionRenderers\",\"id\":\"p7863\"},\"data\":{\"type\":\"map\",\"entries\":[[\"start\",[\"0\",\"0\",\"0\",\"1\",\"1\",\"1\",\"1\",\"1\",\"1\",\"1\",\"2\",\"2\",\"2\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"3\",\"4\",\"4\",\"4\",\"4\",\"4\",\"4\",\"4\",\"4\",\"4\",\"4\",\"4\",\"5\",\"5\",\"5\",\"5\",\"5\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"6\",\"7\",\"7\"
,\"7\",\"7\",\"7\",\"7\",\"7\",\"7\",\"7\",\"7\",\"8\",\"8\",\"9\",\"9\",\"9\",\"10\",\"10\",\"10\",\"10\",\"10\",\"10\",\"10\",\"10\",\"10\",\"10\",\"10\",\"10\",\"10\",\"11\",\"11\",\"11\",\"11\",\"11\",\"11\",\"11\",\"12\",\"12\",\"12\",\"12\",\"12\",\"12\",\"12\",\"12\",\"12\",\"12\",\"12\",\"12\",\"13\",\"13\",\"13\",\"13\",\"14\",\"14\",\"14\",\"14\",\"14\",\"15\",\"15\",\"15\",\"15\",\"15\",\"15\",\"15\",\"15\",\"15\",\"15\",\"15\",\"15\",\"15\",\"16\",\"16\",\"16\",\"16\",\"16\",\"17\",\"17\",\"17\",\"17\",\"17\",\"17\",\"17\",\"17\",\"17\",\"17\",\"17\",\"18\",\"19\",\"19\",\"19\",\"19\",\"19\",\"19\",\"19\",\"20\",\"20\",\"20\",\"20\",\"21\",\"21\",\"21\",\"21\",\"21\",\"21\",\"21\",\"21\",\"21\",\"21\",\"22\",\"22\",\"22\",\"23\",\"23\",\"23\",\"23\",\"23\",\"23\",\"23\",\"23\",\"23\",\"24\",\"24\",\"24\",\"24\",\"24\",\"24\",\"24\",\"24\",\"25\",\"25\",\"25\",\"25\",\"25\",\"25\",\"25\",\"26\",\"26\",\"26\",\"26\",\"27\",\"27\",\"27\",\"27\",\"27\",\"27\",\"28\",\"28\",\"28\",\"28\",\"28\",\"29\",\"29\",\"30\",\"30\",\"30\",\"30\",\"31\",\"31\",\"31\",\"32\",\"32\",\"32\",\"32\",\"33\",\"33\",\"34\",\"34\",\"34\",\"34\",\"34\",\"34\",\"34\",\"34\",\"35\",\"35\",\"35\",\"36\",\"36\",\"36\",\"36\",\"37\",\"38\",\"38\",\"38\",\"38\",\"38\",\"39\",\"40\",\"40\",\"40\",\"41\",\"42\",\"43\",\"43\",\"44\",\"45\"]],[\"end\",[\"1\",\"2\",\"8\",\"2\",\"5\",\"8\",\"22\",\"29\",\"37\",\"41\",\"8\",\"18\",\"29\",\"6\",\"7\",\"10\",\"12\",\"21\",\"23\",\"26\",\"27\",\"30\",\"33\",\"34\",\"36\",\"38\",\"40\",\"43\",\"44\",\"47\",\"48\",\"7\",\"15\",\"17\",\"23\",\"24\",\"30\",\"31\",\"32\",\"38\",\"39\",\"47\",\"8\",\"22\",\"37\",\"41\",\"49\",\"10\",\"11\",\"12\",\"15\",\"16\",\"17\",\"19\",\"21\",\"27\",\"28\",\"34\",\"36\",\"38\",\"40\",\"43\",\"44\",\"48\",\"15\",\"17\",\"23\",\"24\",\"30\",\"31\",\"34\",\"38\",\"43\",\"47\",\"20\",\"29\",\"13\",\"31\",\"39\",\"12\",\"19\",\"21\",\"23\",\"26\",\"27\",\"33\",\"36\",\"38\",\"40\",\"43\",\"44\",\"48\",\"14\",\"16\",\"
19\",\"35\",\"36\",\"45\",\"48\",\"19\",\"21\",\"23\",\"26\",\"27\",\"33\",\"36\",\"38\",\"40\",\"43\",\"44\",\"48\",\"22\",\"32\",\"41\",\"49\",\"16\",\"18\",\"35\",\"45\",\"46\",\"17\",\"20\",\"23\",\"24\",\"25\",\"28\",\"30\",\"32\",\"34\",\"38\",\"40\",\"42\",\"47\",\"19\",\"35\",\"36\",\"45\",\"48\",\"20\",\"24\",\"25\",\"28\",\"30\",\"32\",\"34\",\"38\",\"40\",\"42\",\"47\",\"46\",\"21\",\"27\",\"35\",\"36\",\"40\",\"44\",\"48\",\"25\",\"28\",\"29\",\"42\",\"26\",\"27\",\"33\",\"34\",\"36\",\"38\",\"40\",\"43\",\"44\",\"48\",\"37\",\"41\",\"49\",\"24\",\"26\",\"30\",\"34\",\"38\",\"40\",\"43\",\"44\",\"47\",\"30\",\"31\",\"32\",\"34\",\"38\",\"39\",\"43\",\"47\",\"28\",\"29\",\"32\",\"34\",\"38\",\"42\",\"49\",\"27\",\"33\",\"43\",\"44\",\"33\",\"36\",\"40\",\"43\",\"44\",\"48\",\"29\",\"34\",\"38\",\"40\",\"42\",\"42\",\"49\",\"34\",\"38\",\"43\",\"47\",\"32\",\"39\",\"47\",\"39\",\"42\",\"47\",\"49\",\"43\",\"44\",\"36\",\"38\",\"40\",\"42\",\"43\",\"44\",\"47\",\"48\",\"36\",\"45\",\"48\",\"38\",\"40\",\"44\",\"48\",\"41\",\"40\",\"43\",\"44\",\"47\",\"48\",\"47\",\"43\",\"44\",\"48\",\"49\",\"49\",\"44\",\"47\",\"48\",\"46\"]]]}}},\"view\":{\"type\":\"object\",\"name\":\"CDSView\",\"id\":\"p7865\",\"attributes\":{\"filter\":{\"type\":\"object\",\"name\":\"AllIndices\",\"id\":\"p7866\"}}},\"glyph\":{\"type\":\"object\",\"name\":\"MultiLine\",\"id\":\"p7871\",\"attributes\":{\"line_color\":{\"type\":\"value\",\"value\":\"#000000\"},\"line_alpha\":{\"type\":\"value\",\"value\":0.3},\"line_width\":{\"type\":\"value\",\"value\":1.5}}}}},\"selection_policy\":{\"type\":\"object\",\"name\":\"NodesOnly\",\"id\":\"p7867\"},\"inspection_policy\":{\"type\":\"object\",\"name\":\"NodesOnly\",\"id\":\"p7868\"}}}],\"toolbar\":{\"type\":\"object\",\"name\":\"Toolbar\",\"id\":\"p7825\",\"attributes\":{\"logo\":null,\"tools\":[{\"type\":\"object\",\"name\":\"PanTool\",\"id\":\"p7838\"},{\"type\":\"object\",\"name\":\"WheelZoomTool\",\"id\":\"p7839\",\"attributes\":{\"rendere
rs\":\"auto\"}},{\"type\":\"object\",\"name\":\"SaveTool\",\"id\":\"p7840\"},{\"type\":\"object\",\"name\":\"ResetTool\",\"id\":\"p7841\"},{\"type\":\"object\",\"name\":\"HoverTool\",\"id\":\"p7842\",\"attributes\":{\"renderers\":\"auto\"}},{\"type\":\"object\",\"name\":\"HoverTool\",\"id\":\"p7872\",\"attributes\":{\"renderers\":\"auto\",\"tooltips\":[[\"Node\",\"@index\"],[\"Group\",\"@group\"],[\"Centrality\",\"@radius{0.00}\"]]}}],\"active_scroll\":{\"id\":\"p7839\"}}},\"left\":[{\"type\":\"object\",\"name\":\"LinearAxis\",\"id\":\"p7833\",\"attributes\":{\"visible\":false,\"ticker\":{\"type\":\"object\",\"name\":\"BasicTicker\",\"id\":\"p7834\",\"attributes\":{\"mantissas\":[1,2,5]}},\"formatter\":{\"type\":\"object\",\"name\":\"BasicTickFormatter\",\"id\":\"p7835\"},\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p7836\"}}}],\"below\":[{\"type\":\"object\",\"name\":\"LinearAxis\",\"id\":\"p7828\",\"attributes\":{\"visible\":false,\"ticker\":{\"type\":\"object\",\"name\":\"BasicTicker\",\"id\":\"p7829\",\"attributes\":{\"mantissas\":[1,2,5]}},\"formatter\":{\"type\":\"object\",\"name\":\"BasicTickFormatter\",\"id\":\"p7830\"},\"major_label_policy\":{\"type\":\"object\",\"name\":\"AllLabels\",\"id\":\"p7831\"}}}],\"center\":[{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p7832\",\"attributes\":{\"visible\":false,\"axis\":{\"id\":\"p7828\"}}},{\"type\":\"object\",\"name\":\"Grid\",\"id\":\"p7837\",\"attributes\":{\"visible\":false,\"dimension\":1,\"axis\":{\"id\":\"p7833\"}}}],\"background_fill_color\":\"#F4F4F4\"}}]}}\n",
" </script>\n",
" <script type=\"text/javascript\">\n",
" (function() {\n",
" const fn = function() {\n",
" Bokeh.safely(function() {\n",
" (function(root) {\n",
" function embed_document(root) {\n",
" const docs_json = document.getElementById('c389e08f-4d74-4d25-a3d1-a7ad2850e518').textContent;\n",
" const render_items = [{\"docid\":\"4e8e9336-b495-4592-ae17-d9ab2bc2ac1d\",\"roots\":{\"p7816\":\"d64c1a7a-b66d-4c81-acec-907537260e4a\"},\"root_ids\":[\"p7816\"]}];\n",
" root.Bokeh.embed.embed_items(docs_json, render_items);\n",
" }\n",
" if (root.Bokeh !== undefined) {\n",
" embed_document(root);\n",
" } else {\n",
" let attempts = 0;\n",
" const timer = setInterval(function(root) {\n",
" if (root.Bokeh !== undefined) {\n",
" clearInterval(timer);\n",
" embed_document(root);\n",
" } else {\n",
" attempts++;\n",
" if (attempts > 100) {\n",
" clearInterval(timer);\n",
" console.log(\"Bokeh: ERROR: Unable to run BokehJS code because BokehJS library is missing\");\n",
" }\n",
" }\n",
" }, 10, root)\n",
" }\n",
" })(window);\n",
" });\n",
" };\n",
" if (document.readyState != \"loading\") fn();\n",
" else document.addEventListener(\"DOMContentLoaded\", fn);\n",
|
||
" })();\n",
|
||
" </script>\n",
|
||
" </body>\n",
|
||
"</html>\n"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 111
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-29T16:56:06.571404Z",
|
||
"start_time": "2024-12-29T16:56:06.569280Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": [
|
||
"graph_engine = await get_graph_engine()\n",
|
||
"print(graph_url)"
|
||
],
|
||
"id": "8f69caa0e353a889",
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"https://hub.graphistry.com/graph/graph.html?dataset=cc21b1d2d6074323aa37af53e693b1a4&type=arrow&viztoken=db05565e-79e9-4fe3-99b2-b7a2e6d48eff&usertag=5f822e63-pygraphistry-0.33.9&splashAfter=1735491366&info=true\n"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 13
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "59e6c3c3",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### We can also do a search on the data to explore the knowledge."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"id": "e5e7dfc8",
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T13:44:16.575843Z",
|
||
"start_time": "2024-12-24T13:44:16.047897Z"
|
||
}
|
||
},
|
||
"source": [
"async def search(\n",
"    vector_engine,\n",
"    collection_name: str,\n",
"    query_text: str,\n",
"):\n",
"    # Embed the query text into a vector.\n",
"    query_vector = (await vector_engine.embedding_engine.embed_text([query_text]))[0]\n",
"\n",
"    connection = await vector_engine.get_connection()\n",
"    collection = await connection.open_table(collection_name)\n",
"\n",
"    # Fetch the 10 nearest rows as a pandas DataFrame.\n",
"    results = await collection.vector_search(query_vector).limit(10).to_pandas()\n",
"\n",
"    result_values = list(results.to_dict(\"index\").values())\n",
"\n",
"    # The score is the vector distance, so a lower score means a closer match.\n",
"    return [dict(\n",
"        id = str(result[\"id\"]),\n",
"        payload = result[\"payload\"],\n",
"        score = result[\"_distance\"],\n",
"    ) for result in result_values]\n",
"\n",
"\n",
"from cognee.infrastructure.databases.vector import get_vector_engine\n",
"\n",
"vector_engine = get_vector_engine()\n",
"results = await search(vector_engine, \"entity_name\", \"sarah.nguyen@example.com\")\n",
"for result in results:\n",
"    print(result)"
|
||
],
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"{'id': '4d8dda57-2681-5264-a2bd-e2ddfe66a785', 'payload': {'id': '4d8dda57-2681-5264-a2bd-e2ddfe66a785', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'sarah nguyen'}, 'score': 0.5708460211753845}\n",
|
||
"{'id': '198e2ab8-75e9-5931-97ab-da9a5a8e188c', 'payload': {'id': '198e2ab8-75e9-5931-97ab-da9a5a8e188c', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'san francisco, ca'}, 'score': 1.349550724029541}\n",
|
||
"{'id': '435dbd37-ab20-503c-9e99-ab8b8a3484e5', 'payload': {'id': '435dbd37-ab20-503c-9e99-ab8b8a3484e5', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'senior data scientist'}, 'score': 1.3934645652770996}\n",
|
||
"{'id': '36a5e3c8-c5f5-5ab5-8d59-ea69d8b36932', 'payload': {'id': '36a5e3c8-c5f5-5ab5-8d59-ea69d8b36932', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'jessica miller'}, 'score': 1.4042469263076782}\n",
|
||
"{'id': '73ae630f-7b09-5dce-8c18-45d0a57b30f9', 'payload': {'id': '73ae630f-7b09-5dce-8c18-45d0a57b30f9', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'michael rodriguez'}, 'score': 1.4521806240081787}\n",
|
||
"{'id': '29e771c8-4c3f-52de-9511-6b705878e130', 'payload': {'id': '29e771c8-4c3f-52de-9511-6b705878e130', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'dr. emily carter'}, 'score': 1.471205472946167}\n",
|
||
"{'id': 'ce8b394a-b30e-52fc-b80a-6352edc60e5b', 'payload': {'id': 'ce8b394a-b30e-52fc-b80a-6352edc60e5b', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'stanford university'}, 'score': 1.4871069192886353}\n",
|
||
"{'id': '9780afb1-dccc-53eb-9a30-c0d4ce033711', 'payload': {'id': '9780afb1-dccc-53eb-9a30-c0d4ce033711', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'innovateai labs'}, 'score': 1.498255968093872}\n",
|
||
"{'id': '301b3cf8-5a5c-585e-80bd-f79901e4368c', 'payload': {'id': '301b3cf8-5a5c-585e-80bd-f79901e4368c', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'university of texas at austin'}, 'score': 1.5053001642227173}\n",
|
||
"{'id': '2c02c93c-9cd1-56b8-9cc0-55ff0b290e57', 'payload': {'id': '2c02c93c-9cd1-56b8-9cc0-55ff0b290e57', 'updated_at': datetime.datetime(2024, 12, 24, 11, 54, 13, 481297), 'topological_rank': 0, 'text': 'university of california, berkeley'}, 'score': 1.5213639736175537}\n"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 23
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "81fa2b00",
|
||
"metadata": {},
|
||
"source": [
"#### We normalize search output scores so that a lower score means a closer match, that is, a result more likely to be what you're looking for. In the example above, we searched the knowledge graph for node entities related to \"sarah.nguyen@example.com\"."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "1b94ff96",
|
||
"metadata": {},
|
||
"source": [
"#### In the example below, we'll use cognee search to summarize information about the node most related to \"sarah.nguyen@example.com\" in the knowledge graph."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "21a3e9a6",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"from cognee.api.v1.search import SearchType\n",
"\n",
"node = (await vector_engine.search(\"entity_name\", \"sarah.nguyen@example.com\"))[0]\n",
"node_name = node.payload[\"text\"]\n",
"\n",
"search_results = await cognee.search(SearchType.SUMMARIES, query_text = node_name)\n",
"print(\"\\n\\nExtracted summaries are:\\n\")\n",
"for result in search_results:\n",
"    print(f\"{result}\\n\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "fd6e5fe2",
|
||
"metadata": {},
|
||
"source": [
"#### In this example, we'll use cognee search to find the chunks that contain the node most related to \"sarah.nguyen@example.com\"."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "c7a8abff",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"search_results = await cognee.search(SearchType.CHUNKS, query_text = node_name)\n",
"print(\"\\n\\nExtracted chunks are:\\n\")\n",
"for result in search_results:\n",
"    print(f\"{result}\\n\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "47f0112f",
|
||
"metadata": {},
|
||
"source": [
"#### In this example, we'll use cognee search to get insights from the knowledge graph about the node most related to \"sarah.nguyen@example.com\"."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "706a3954",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"search_results = await cognee.search(SearchType.INSIGHTS, query_text = node_name)\n",
"print(\"\\n\\nExtracted sentences are:\\n\")\n",
"for result in search_results:\n",
"    print(f\"{result}\\n\")"
|
||
]
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T13:46:09.644509Z",
|
||
"start_time": "2024-12-24T13:46:04.538592Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": [
|
||
"!pip install wget\n",
|
||
"!pip install deepeval\n",
|
||
"!pip install ujson"
|
||
],
|
||
"id": "afae18ac6a794925",
"outputs": [],
|
||
"execution_count": 29
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T15:29:11.123483Z",
|
||
"start_time": "2024-12-24T15:29:11.120888Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": [
|
||
"from evals.eval_on_hotpot import eval_on_hotpotQA\n",
|
||
"from evals.eval_on_hotpot import answer_with_cognee\n",
|
||
"from evals.eval_on_hotpot import answer_without_cognee\n",
|
||
"from evals.eval_on_hotpot import eval_answers\n",
|
||
"from cognee.base_config import get_base_config\n",
|
||
"from pathlib import Path\n",
|
||
"from tqdm import tqdm\n",
|
||
"import wget\n",
|
||
"import json\n",
|
||
"import statistics"
|
||
],
|
||
"id": "5f36b67668fdb646",
|
||
"outputs": [],
|
||
"execution_count": 2
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T15:57:30.764970Z",
|
||
"start_time": "2024-12-24T15:57:07.861187Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": [
"answer_provider = answer_without_cognee  # Native LLM answers; use answer_with_cognee to answer with cognee\n",
"num_samples = 10  # With cognee, it takes ~1m10s per sample\n",
"\n",
"base_config = get_base_config()\n",
"data_root_dir = base_config.data_root_directory\n",
"\n",
"# Create the data directory (and any missing parents) if it does not exist yet.\n",
"Path(data_root_dir).mkdir(parents=True, exist_ok=True)\n",
"\n",
"filepath = data_root_dir / Path(\"hotpot_dev_fullwiki_v1.json\")\n",
"if not filepath.exists():\n",
"    url = 'http://curtis.ml.cmu.edu/datasets/hotpot/hotpot_dev_fullwiki_v1.json'\n",
"    wget.download(url, out=data_root_dir)\n",
"\n",
"with open(filepath, \"r\") as file:\n",
"    dataset = json.load(file)\n",
"\n",
"# Use the whole dataset when num_samples is falsy, otherwise only the first num_samples items.\n",
"instances = dataset if not num_samples else dataset[:num_samples]\n",
"answers = []\n",
"for instance in tqdm(instances, desc=\"Getting answers\"):\n",
"    answer = await answer_provider(instance)\n",
"    answers.append(answer)"
|
||
],
|
||
"id": "d5af4b516c6621a3",
|
||
"outputs": [
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Getting answers: 100%|██████████| 10/10 [00:13<00:00, 1.31s/it]\n"
|
||
]
|
||
}
|
||
],
|
||
"execution_count": 9
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T15:57:30.787382Z",
|
||
"start_time": "2024-12-24T15:57:30.785259Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": [
|
||
"from evals.deepeval_metrics import f1_score_metric\n",
|
||
"from evals.deepeval_metrics import em_score_metric"
|
||
],
|
||
"id": "2bf69048a272158c",
|
||
"outputs": [],
|
||
"execution_count": 10
|
||
},
|
||
{
|
||
"metadata": {
|
||
"ExecuteTime": {
|
||
"end_time": "2024-12-24T15:57:30.828509Z",
|
||
"start_time": "2024-12-24T15:57:30.795197Z"
|
||
}
|
||
},
|
||
"cell_type": "code",
|
||
"source": [
|
||
"f1_metric = f1_score_metric()\n",
|
||
"eval_results = await eval_answers(instances, answers, f1_metric)\n",
|
||
"avg_f1_score = statistics.mean([result.metrics_data[0].score for result in eval_results.test_results])\n",
|
||
"print(\"F1 score: \", avg_f1_score)"
|
||
],
|
||
"id": "72ba5f89cccbee6b",
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"✨ You're running DeepEval's latest \u001B[38;2;106;0;255mOfficial hotpot F1 score Metric\u001B[0m! \u001B[1;38;2;55;65;81m(\u001B[0m\u001B[38;2;55;65;81musing \u001B[0m\u001B[3;38;2;55;65;81mNone\u001B[0m\u001B[38;2;55;65;81m, \u001B[0m\u001B[38;2;55;65;81mstrict\u001B[0m\u001B[38;2;55;65;81m=\u001B[0m\u001B[3;38;2;55;65;81mFalse\u001B[0m\u001B[38;2;55;65;81m, \u001B[0m\u001B[38;2;55;65;81masync_mode\u001B[0m\u001B[38;2;55;65;81m=\u001B[0m\u001B[3;38;2;55;65;81mTrue\u001B[0m\u001B[1;38;2;55;65;81m)\u001B[0m\u001B[38;2;55;65;81m...\u001B[0m\n"
|
||
],
|
||
"text/html": [
|
||
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">✨ You're running DeepEval's latest <span style=\"color: #6a00ff; text-decoration-color: #6a00ff\">Official hotpot F1 score Metric</span>! <span style=\"color: #374151; text-decoration-color: #374151; font-weight: bold\">(</span><span style=\"color: #374151; text-decoration-color: #374151\">using </span><span style=\"color: #374151; text-decoration-color: #374151; font-style: italic\">None</span><span style=\"color: #374151; text-decoration-color: #374151\">, </span><span style=\"color: #374151; text-decoration-color: #374151\">strict</span><span style=\"color: #374151; text-decoration-color: #374151\">=</span><span style=\"color: #374151; text-decoration-color: #374151; font-style: italic\">False</span><span style=\"color: #374151; text-decoration-color: #374151\">, </span><span style=\"color: #374151; text-decoration-color: #374151\">async_mode</span><span style=\"color: #374151; text-decoration-color: #374151\">=</span><span style=\"color: #374151; text-decoration-color: #374151; font-style: italic\">True</span><span style=\"color: #374151; text-decoration-color: #374151; font-weight: bold\">)</span><span style=\"color: #374151; text-decoration-color: #374151\">...</span>\n",
|
||
"</pre>\n"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
},
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Event loop is already running. Applying nest_asyncio patch to allow async execution...\n"
|
||
]
|
||
},
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Evaluating 10 test case(s) in parallel: |██████████|100% (10/10) [Time Taken: 00:00, 407.84test case/s]"
|
||
]
|
||
},
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"\n",
|
||
"======================================================================\n",
|
||
"\n",
|
||
"Metrics Summary\n",
|
||
"\n",
|
||
" - ❌ Official hotpot F1 score (score: 0.0, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
|
||
"\n",
|
||
"For test case:\n",
|
||
"\n",
|
||
" - input: Were Scott Derrickson and Ed Wood of the same nationality?\n",
|
||
" - actual output: Scott Derrickson is described as an American filmmaker in the context. Ed Wood is referenced as a cult filmmaker in the American biographical film \"Ed Wood.\" Therefore, both Scott Derrickson and Ed Wood are of the same nationality: American.\n",
|
||
" - expected output: yes\n",
|
||
" - context: None\n",
|
||
" - retrieval context: None\n",
|
||
"\n",
|
||
"======================================================================\n",
|
||
"\n",
|
||
"Metrics Summary\n",
|
||
"\n",
|
||
" - ❌ Official hotpot F1 score (score: 0.04255319148936171, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
|
||
"\n",
|
||
"For test case:\n",
|
||
"\n",
|
||
" - input: What government position was held by the woman who portrayed Corliss Archer in the film Kiss and Tell?\n",
|
||
" - actual output: The woman who portrayed Corliss Archer in the film \"Kiss and Tell\" was Shirley Temple. After her acting career, she held the position of U.S. Ambassador to Czechoslovakia from 1989 to 1992, and she was also a member of the U.S. delegation to the United Nations from 1969 to 1970.\n",
|
||
" - expected output: Chief of Protocol\n",
|
||
" - context: None\n",
|
||
" - retrieval context: None\n",
|
||
"\n",
|
||
"======================================================================\n",
"\n",
"Metrics Summary\n",
"\n",
" - ❌ Official hotpot F1 score (score: 0.05882352941176471, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
"\n",
"For test case:\n",
"\n",
" - input: What science fantasy young adult series, told in first person, has a set of companion books narrating the stories of enslaved worlds and alien species?\n",
" - actual output: The book series that fits the description is \"Animorphs\". It is a science fantasy young adult series told in the first person with multiple narrators, and it deals with themes relevant to enslaved worlds and alien species.\n",
" - expected output: Animorphs\n",
" - context: None\n",
" - retrieval context: None\n",
"\n",
"======================================================================\n",
"\n",
"Metrics Summary\n",
"\n",
" - ❌ Official hotpot F1 score (score: 0.0, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
"\n",
"For test case:\n",
"\n",
" - input: Are the Laleli Mosque and Esma Sultan Mansion located in the same neighborhood?\n",
" - actual output: The Laleli Mosque is located in Laleli, Fatih, Istanbul. The Esma Sultan Mansion is located in the Ortaköy neighborhood of Istanbul. Therefore, they are not located in the same neighborhood.\n",
" - expected output: no\n",
" - context: None\n",
" - retrieval context: None\n",
"\n",
"======================================================================\n",
"\n",
"Metrics Summary\n",
"\n",
" - ❌ Official hotpot F1 score (score: 0.22222222222222218, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
"\n",
"For test case:\n",
"\n",
" - input: The director of the romantic comedy \"Big Stone Gap\" is based in what New York city?\n",
" - actual output: The director of the romantic comedy \"Big Stone Gap\" is Adriana Trigiani, and there is no specific New York city mentioned in the provided context.\n",
" - expected output: Greenwich Village, New York City\n",
" - context: None\n",
" - retrieval context: None\n",
"\n",
"======================================================================\n",
"\n",
"Metrics Summary\n",
"\n",
" - ❌ Official hotpot F1 score (score: 0.11764705882352941, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
"\n",
"For test case:\n",
"\n",
" - input: 2014 S/S is the debut album of a South Korean boy group that was formed by who?\n",
" - actual output: The South Korean boy group that debuted with the album \"2014 S/S\" was formed by FNC Entertainment.\n",
" - expected output: YG Entertainment\n",
" - context: None\n",
" - retrieval context: None\n",
"\n",
"======================================================================\n",
"\n",
"Metrics Summary\n",
"\n",
" - ❌ Official hotpot F1 score (score: 0.18181818181818182, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
"\n",
"For test case:\n",
"\n",
" - input: Who was known by his stage name Aladin and helped organizations improve their performance as a consultant?\n",
" - actual output: Eenasul Fateh, known by his stage name Aladin, is a Bangladeshi-British cultural practitioner, magician, live artist, and former international management consultant.\n",
" - expected output: Eenasul Fateh\n",
" - context: None\n",
" - retrieval context: None\n",
"\n",
"======================================================================\n",
"\n",
"Metrics Summary\n",
"\n",
" - ❌ Official hotpot F1 score (score: 0.0, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
"\n",
"For test case:\n",
"\n",
" - input: The arena where the Lewiston Maineiacs played their home games can seat how many people?\n",
" - actual output: The Lewiston Maineiacs played their home games at the Androscoggin Bank Colisée. The capacity of the arena is not directly stated in the context, but it is implied from the information provided.\n",
" - expected output: 3,677 seated\n",
" - context: None\n",
" - retrieval context: None\n",
"\n",
"======================================================================\n",
"\n",
"Metrics Summary\n",
"\n",
" - ❌ Official hotpot F1 score (score: 0.12903225806451613, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
"\n",
"For test case:\n",
"\n",
" - input: Who is older, Annie Morton or Terry Richardson?\n",
" - actual output: Annie Morton was born on October 8, 1970, while Terry Richardson's birthdate is not provided in the context. Therefore, based on the available information, Annie Morton is older than Terry Richardson.\n",
" - expected output: Terry Richardson\n",
" - context: None\n",
" - retrieval context: None\n",
"\n",
"======================================================================\n",
"\n",
"Metrics Summary\n",
"\n",
" - ❌ Official hotpot F1 score (score: 0.0, threshold: 0.5, strict: False, evaluation model: None, reason: None, error: None)\n",
"\n",
"For test case:\n",
"\n",
" - input: Are Local H and For Against both from the United States?\n",
" - actual output: Yes, Local H is from the United States, specifically mentioned as a band from the Chicago suburbs, and For Against is also a band mentioned as being from the United States, although specific details about For Against's origin are not provided in the context.\n",
" - expected output: yes\n",
" - context: None\n",
" - retrieval context: None\n",
"\n",
"======================================================================\n",
"\n",
"Overall Metric Pass Rates\n",
"\n",
"Official hotpot F1 score: 0.00% pass rate\n",
"\n",
"======================================================================\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
},
{
"data": {
"text/plain": [
"\u001B[38;2;5;245;141m✓\u001B[0m Tests finished 🎉! Run \u001B[32m'deepeval login'\u001B[0m to save and analyze evaluation results on Confident AI. \n",
"‼️ Friendly reminder 😇: You can also run evaluations with ALL of deepeval's metrics directly on Confident AI \n",
"instead.\n"
],
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #05f58d; text-decoration-color: #05f58d\">✓</span> Tests finished 🎉! Run <span style=\"color: #008000; text-decoration-color: #008000\">'deepeval login'</span> to save and analyze evaluation results on Confident AI. \n",
"‼️ Friendly reminder 😇: You can also run evaluations with ALL of deepeval's metrics directly on Confident AI \n",
"instead.\n",
"</pre>\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"F1 score: 0.0752096441829576\n"
]
}
],
"execution_count": 11
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2024-12-24T15:26:14.946766Z",
"start_time": "2024-12-24T15:26:14.944741Z"
}
},
"cell_type": "code",
"source": [
"for n in range(1,4):\n",
" print(n)"
],
"id": "783985c35d1126de",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n",
"2\n",
"3\n"
]
}
],
"execution_count": 38
},
{
"cell_type": "markdown",
"id": "288ab570",
"metadata": {},
"source": [
"# Give us a star if you like it!\n",
"https://github.com/topoteretes/cognee"
]
},
{
"metadata": {},
"cell_type": "code",
"outputs": [],
"execution_count": null,
"source": " 0.0667",
"id": "d042efe5d38144fa"
},
{
"metadata": {},
"cell_type": "code",
"outputs": [],
"execution_count": null,
"source": "0.085",
"id": "9436af97520e0ae"
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}