Milvus 1.1 Release Now Available!

Milvus, an LF AI & Data Foundation Incubation Project, has released version 1.1! Milvus is an open source vector database that is highly flexible, reliable, and blazing fast. It supports adding, deleting, updating, and near real-time search of vectors on a trillion-byte scale.

In version 1.1, Milvus adds a variety of improvements. Highlights include:

Improvements

  • #4756 Improves the performance of the get_entity_by_id() method call.
  • #4856 Upgrades hnswlib to v0.5.0.
  • #4958 Improves the performance of IVF index training.

New Features

  • #4564 Supports specifying partition in a get_entity_by_id() method call.
  • #4806 Supports specifying partition in a delete_entity_by_id() method call.
  • #4905 Adds the release_collection() method, which unloads a specific collection from cache (these additions are sketched in the example after this list).

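These new calls can be exercised from the Python client. The snippet below is a minimal sketch assuming a Milvus 1.1 server on localhost:19530 and the pymilvus 1.1 client, with the partition_tag keyword assumed to carry the new partition support; the collection name, partition tag, and IDs are hypothetical.

```python
# Minimal sketch (not an official example) of the Milvus 1.1 additions from Python.
# Assumes a Milvus 1.1 server on localhost:19530 and the pymilvus 1.1 client;
# the partition_tag keyword on the two *_entity_by_id calls is assumed to reflect
# the new partition support. Collection name, partition tag, and IDs are made up.
from milvus import Milvus

client = Milvus(host="localhost", port="19530")

collection = "demo_collection"   # hypothetical collection name
partition = "2021_q2"            # hypothetical partition tag
ids = [1, 2, 3]

# Fetch vectors by ID, restricted to one partition (#4564)
status, vectors = client.get_entity_by_id(collection, ids, partition_tag=partition)

# Delete vectors by ID within the same partition (#4806)
status = client.delete_entity_by_id(collection, ids, partition_tag=partition)

# Unload the collection from cache when it is no longer being queried (#4905)
status = client.release_collection(collection)
```
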
Fixed Issues

  • #4778 Fails to access vector index in Mishards.
  • #4797 The system returns false results after merging search requests with different topK parameters.
  • #4838 The server does not respond immediately to an index building request on an empty collection.
  • #4858 For GPU-enabled Milvus, the system crashes on a search request with a large topK (> 2048).
  • #4862 A read-only node merges segments during startup.
  • #4894 The capacity of a Bloom filter does not equal the row count of the segment it belongs to.
  • #4908 The GPU cache is not cleaned up after a collection is dropped.
  • #4933 It takes a long while for the system to build an index for a small segment.
  • #4952 Fails to set timezone as “UTC + 5:30”.
  • #5008 The system crashes randomly during continuous, concurrent delete, insert, and search operations.
  • #5010 For GPU-enabled Milvus, query fails on IVF_PQ if nbits ≠ 8.
  • #5050 get_collection_stats() returns false index type for segments still in the process of index building.
  • #5063 The system crashes when an empty segment is flushed.
  • #5078 For GPU-enabled Milvus, the system crashes when creating an IVF index on vectors of 2048, 4096, or 8192 dimensions.

As usual, there is strong support for Milvus from our fantastic open source community! We thank the following individuals for making their pull request part of Milvus 1.1:

To learn more about the Milvus 1.1 release, check out the full release notes. Want to get involved with Milvus? Be sure to join the Milvus-Announce and Milvus-Technical-Discuss mailing lists to join the community and stay connected on the latest updates. 

Congratulations to the Milvus team and we look forward to continued growth and success as part of the LF AI & Data Foundation! To learn about hosting an open source project with us, visit the LF AI & Data Foundation website.


Trusted AI Principles – RREPEATS Practical Examples Review

Guest Author: Susan Malaika, LF AI & Data Trusted AI Committee Member

On April 28, 2021, the LF AI & Data Trusted AI Principles Working Group hosted a session about applying the eight RREPEATS principles to two examples: one in the context of network providers, the other in banking. The Practical Examples session was a follow-up to the February session that introduced the RREPEATS principles.

After the presentation of the two examples, a round table discussion took place, covering the application of trusted AI principles in companies. A key outcome from the discussion was the recognition that low-level tactical tools can be used to implement trusted AI principles. Another outcome was the importance of having an ethics board in corporations.

Example: Classification of Encrypted Traffic Application

Iman Akbari Azirani and Noura Limam from the University of Waterloo in Canada, and Bertrand Mathieu from Orange Labs in France, explained that Internet data payloads and, increasingly, headers are encrypted. Roughly 85% of Internet traffic is now encrypted, and Google services are 100% encrypted, making it difficult to classify network traffic. To anticipate workloads and offer good service to customers, network operators need to understand this traffic in order to:

  • Provide accurate capacity planning
  • Detect fraud, such as attempts to disguise data that would normally be paid for as free traffic; a common example is transmitting video over a text service.

Bertrand, Noura, and Iman discussed three of the RREPEATS principles in the context of the encrypted traffic application: Equitability, Privacy, and Explainability.

To ensure equitability and privacy, significant amounts of information (e.g., temporal information, IP addresses, payloads, security certificates) are removed from the data. For explainability, it is important to understand that classification is applied at the global level (service and application) and not at an individual level. Deep neural networks are used for classification and focus on three types of features (a three-faceted model): Transport Layer Security (TLS) handshake bytes, traffic shape (size, direction, inter-arrival times), and statistical features. Despite focusing on protocol-agnostic features, the classification accuracy of this approach is high.

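To make the three-faceted model concrete, here is a rough PyTorch sketch of such a classifier, combining a handshake-byte branch, a packet-shape sequence branch, and a statistical-feature branch. This is an illustration only, not the presenters' code; the feature dimensions, layer sizes, and class count are invented placeholders.

```python
# Illustrative sketch of a three-branch traffic classifier in the spirit of the
# presentation: TLS handshake bytes, traffic-shape sequences (size, direction,
# inter-arrival time), and aggregate statistical features. All sizes are made up.
import torch
import torch.nn as nn

class TrafficClassifier(nn.Module):
    def __init__(self, n_classes=10, n_stats=20):
        super().__init__()
        # Branch 1: raw TLS handshake bytes (values 0-255), embedded then pooled
        self.byte_embed = nn.Embedding(256, 16)
        self.handshake_net = nn.Sequential(
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        # Branch 2: per-packet (size, direction, inter-arrival time) sequence
        self.shape_net = nn.GRU(input_size=3, hidden_size=32, batch_first=True)
        # Branch 3: aggregate statistical features
        self.stats_net = nn.Sequential(nn.Linear(n_stats, 32), nn.ReLU())
        # Fused classifier over the three facets
        self.head = nn.Linear(32 + 32 + 32, n_classes)

    def forward(self, handshake_bytes, shape_seq, stats):
        h = self.byte_embed(handshake_bytes).transpose(1, 2)  # (B, 16, L)
        h = self.handshake_net(h)                              # (B, 32)
        _, s = self.shape_net(shape_seq)                       # s: (1, B, 32)
        fused = torch.cat([h, s.squeeze(0), self.stats_net(stats)], dim=1)
        return self.head(fused)

# Dummy batch to show the expected input shapes
model = TrafficClassifier()
logits = model(
    torch.randint(0, 256, (4, 256)),   # handshake bytes
    torch.randn(4, 32, 3),             # packet-shape sequence
    torch.randn(4, 20),                # statistical features
)
print(logits.shape)  # torch.Size([4, 10])
```
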
This section ended with a request for tools that:

  • Help with network classification 
  • Support efficient/speedier models running in the network
  • Offer better explanations 
  • Prevent attacks on models

Example: RosaeNLG Framework (an LF AI & Data project)

Ludan Stoecklé, CTO of the Data & AI Lab at BNP Paribas CIB and initial author of RosaeNLG, outlined how to explain a decision to a non-expert user, focusing on trusting the decision and being transparent. It has reportedly been shown that non-expert users most often find textual explanations easier to understand than the output of corporate-style dashboards, such as tables and graphs. Computer-generated texts are also often preferred over human-written text, as the generated texts are clearer, less ambiguous, and more concise.

Ludan illustrated his points with a credit application rejection, showing simple text generated via Natural Language Generation to explain the reason for the rejection. The following considerations apply when preparing an explanation for a decision:

  • Interpret the decision
  • Define what to say
  • Define how to say it 

The RosaeNLG framework (an LF AI & Data Sandbox project) provides template-based Natural Language Generation. It automates the production of relatively repetitive texts based on structured input data and textual templates, and is widely used in the financial industry.

Dataset Discussion 

There was an important discussion about open datasets in the areas of network encryption and NLG, and the absence of open networking datasets that could move the field forward. Typically, networking datasets are proprietary and complex, and it is difficult to create a synthetic dataset that mimics real network traces, with many sessions from many users and the variable network conditions that end users experience. However, Orange is investigating the possibility of opening up part of a completely anonymized dataset (with no possibility of reverse engineering it to discover private information).

Round Table – Applying Trusted AI Principles in a Corporation

The first part of the round table discussion focused on the examples presented. One observation was that the principles take on different meanings in the context of different examples, and that understanding the context of each scenario is key. Souad Ouali, the Trusted AI Principles Working Group Chair, responded that the RREPEATS principles were very carefully and deliberately worded to be applicable in a general way across international and varied settings. The round table panelists agreed that Trusted AI principles must be incorporated into the DNA of what we do throughout the AI life cycle, and that it is important to ask questions relating to the principles at every step. Another important observation from the two practical example presentations is how tactical, low-level tools map to high-level Trusted AI principles.

The round table ended with a discussion on corporate ethics boards and whether they should be geographically diverse or distributed, incorporate domain experts, and extend into the operational aspect of a business. Another consideration is whether an ethics board should include representatives from other institutions.

Join Us for Our Next Session

Please join us for The Trusted AI Principles – Tools and Techniques webinar on September 15, 2021. Register here!

Stay connected with the Trusted AI Committee by joining the mailing list here and attending one of our upcoming meetings! Learn more here.


Data Intelligence – AI-based Automation

Guest Author: Dr. Jagreet Kaur, Chief AI Officer, Xenonstack

What is Data Intelligence?

The world is moving towards data-driven intelligence. To keep up with evolving technology and competition, organizations must make data- and AI-based decisions. Organizations that do not work with their data in this way find it difficult to know the facts and gain the insights needed when making decisions.

Data intelligence enables the processing of multi-source data and generates meaningful insights that help in making valuable decisions. It allows unstructured data and text analytics results to be combined with structured data for predictive analytics, and it can provide real-time statistical analysis of structured or unstructured data to reveal data patterns and dependencies.

Why Do You Need It?

Data intelligence is needed to process and understand data, and it is rapidly becoming one of the most important elements of big data. It has progressed from its infancy to a point where it can handle vast amounts of data intelligently, and its early positive results have attracted the attention of many organizations. Entrepreneurs have expressed interest in using and developing data intelligence to make intelligent decisions that drive their businesses. The following cases illustrate why data intelligence is needed:

  • Artificial Intelligence: Using a machine learning algorithm helps to find the predictive analysis and to recognize correlation. It helps to find domain-specific custom entities and word usage.
  • Intuitive Visualization: It allows us to understand data effectively in less time using informative intuitive charts and graphs. Visualization helps to understand complex data within seconds rather than reading and understanding an excel or any other data file. Visualization also generates insights and clear data patterns that are difficult to find in tables or datasets. It enables to easily filter and drill down the reports according to the requirements.
  • Insight generation: Based on the collected data, it allows generating or taking insight from the visualization that helps to understand the business progress and customer needs.
  • Data-driven decision-making: To make better and data-driven decisions so that those correct decisions can be taken to gain customer satisfaction and revenue.

The Base Foundation For Data Intelligence

Data intelligence provides a 360-degree view of the business environment. It helps organizations understand customer requirements better and monitor their own performance and, based on the resulting insights, make decisions according to customer preferences that improve revenue and other benefits. Data intelligence is based on several sets of techniques that enrich business decisions and processes:

  • Descriptive (“What happened”): Reviews and examines historical and real-time data to understand business performance and customer behavior, and detects particular occurrences of a situation.
  • Diagnostic (“How it happened”): Identifies the reason for the occurrence of a particular instance or situation.
  • Predictive (“What could happen”): Uses historical data and, based on it, predicts future occurrences using ML algorithms.
  • Prescriptive (“What should we do”): Develops and analyzes alternative courses of action, helping us understand what to do in the future.
  • Decisive: Decisive analytics measures data suitability and, when there are multiple possibilities, chooses the recommended action to implement in the environment and in real-time processes.

How Can I Use Data Intelligence?

Data intelligence uses the following steps to identify relations and mentions in structured and unstructured data (a minimal end-to-end sketch follows the list).

  • Data ingestion: Structured and unstructured data is collected from different sources such as documents, emails, databases, websites, and data repositories. Data can be inserted into the application or platform manually or ingested on a fixed schedule, and is then available for the application to process and use.
  • Data processing: The collected data is processed and used to generate insights, making it possible to find relations within the data. Several tools provide easy-to-use interfaces for creating, training, and testing custom models that find entities and data relations; these models can then be used for future predictive analytics.
  • Reporting and visualization: The final step presents the analyzed data using charts and graphs, making large and complex data easy to understand.

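As a toy illustration of these three steps (not any specific product's API), the following Python snippet ingests a CSV file, fits a simple predictive model, and charts the result; the file name and column names are hypothetical.

```python
# Toy illustration of ingestion -> processing -> visualization.
# "sales.csv" and its columns (month, ad_spend, revenue) are hypothetical placeholders.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# 1. Data ingestion: load structured data from a source file
df = pd.read_csv("sales.csv")

# 2. Data processing: train a simple predictive model on the collected data
model = LinearRegression()
model.fit(df[["ad_spend"]], df["revenue"])
df["predicted_revenue"] = model.predict(df[["ad_spend"]])

# 3. Reporting and visualization: chart actual vs. predicted revenue
plt.plot(df["month"], df["revenue"], label="actual")
plt.plot(df["month"], df["predicted_revenue"], label="predicted")
plt.xlabel("month")
plt.ylabel("revenue")
plt.legend()
plt.show()
```
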
The Benefits Of Data Intelligence

Data intelligence gives organizations an edge by bringing intelligence to their daily tasks and decisions. Let’s discuss the benefits of data intelligence and why organizations should embrace it:

  • Changing demands: Data intelligence helps organizations adapt to dynamic industry changes. Business today is continuously evolving; to stay competitive and reduce the chances of failure, organizations must recognize and adopt emerging trends. For example, when the adoption of selfie cameras in smartphones was rising, mobile businesses that did not capitalize on the trend were doomed to fail. Data intelligence helps organizations understand customer behavior and change: firms are informed about recurring changes and their patterns of occurrence, allowing them to make informed decisions based on the analysis.
  • Strong foundation of data: Data intelligence strengthens big data by restructuring how data is arranged, allowing insights to be gathered from big data and then used to deliver optimized engagement.
  • Useful data: The world generates a large volume of data every day that could improve services according to customer demands and preferences, but most of it cannot be used directly; it must first be transformed into a useful form. Data intelligence is in charge of converting raw data into cumulative information, cleaning and transforming it into ready-to-use packages that the company can use to assess results, without having to define each particular case for the computer.
  • Augmented analytics: Data intelligence uses advanced statistical approaches to drive visualized predictive and prescriptive analytics. Instead of building a complete application every time, it automates data processing so that analyses can be completed in a few simple steps, and further changes can be recommended based on the results. Such extensive planning against real-life scenarios greatly reduces the risk of business plans failing, and advanced simulations enable businesses to predict potential outcomes and adjust prescriptions as needed.
  • Accelerated innovation: Data intelligence accelerates innovation through the smart use of data, allowing data insights to drive business innovation and to shape services around customer preferences and requirements.

What is the difference between data, information, and intelligence?

  • Data: Data is raw recorded truth at a point in time. It might be a conversation, a purchase, or an interaction with your company’s website. Data is the compilation of results from those incidents, quantifiably recorded so that companies can review them easily.
  • Information: Information is a collection of data, or a way of bringing data together. When data from events is put into narrative form, it helps answer questions such as:
    • What is the employee churn rate?
    • How long is the organization’s sales cycle?
      Information answers the questions that move the business.
  • Intelligence: Intelligence combines pieces of information to drive decisions in applications or tasks. For instance, suppose you are selling more in the southern region; the intelligent question is why that might be. To find an answer, you look at numbers such as the number of events, the amount spent on advertisements, and the marketing campaigns that southern-region clients receive, and then compare them with another region (for example, the northern region). Through this analysis, you might learn that there are more client interactions in the southern region, so to increase sales in the northern region it is necessary to do the same.

Data Intelligence In The Real World

  • Healthcare: Rapidly digitalizing healthcare systems are adopting technologies to create a connected healthcare environment. Hospitals need to keep pace with this technology to become smarter, more advanced, and more accurate. They use many types of sensors, apps, and digital equipment that regularly generate large volumes of data, and this data can be used to automate administrative, treatment, and clinical processes. Data intelligence capabilities allow ML, AI, and deep learning to make healthcare processes faster and more accurate and to help practitioners handle a growing number of cases. These technologies help extract real-time intelligence and support decisions on diagnosis, prescribing, hospital management, laboratories, patient care, and more, leading to higher operational efficiency and better care delivery.
  • Supply chain management: Supply chain software generates and collects vast amounts of data, but organizations often do not know how best to use it to make their operations more effective. Data intelligence in the supply chain network predicts business risk, minimizes loss, and enables automated, self-learning supply chains, driving real-time coordination and innovation.
  • Human resources: Organizations use HR software to manage internal functions such as payroll, employee benefits, recruitment, training, talent management, attendance management, and employee engagement. HR teams constantly work to understand employees better, attract top talent, run retention programs, and analyze performance, and they have a lot of data generated by their HRMS (Human Resource Management System) software. Data intelligence can help them analyze and understand this data, gather insights, and make precise decisions that help the organization run healthier and faster.
  • E-commerce: One of the success secrets of an e-commerce website is using customer reviews to understand customers’ experiences and preferences and then making profitable decisions based on them. ML and NLP techniques are used to interact with customers, collect data from them, and use it to drive performance and improve customer engagement, service quality, support quality, and ultimately sales. Data intelligence makes it possible to accomplish these tasks: recommending products, understanding customer preferences, resolving queries, and improving quality and services. Harnessing this information yields a treasure trove of insights that can power products and processes, improve customer experience and marketing, and help manage store operations.

Conclusion

Akira AI is a data intelligence platform that provides intelligence through analysis and learning, processing data from various sources.


Flyte 0.13.0 Release Now Available!

Flyte, an LF AI & Data Foundation Incubation Project, has released version 0.13.0! Flyte is a Kubernetes-native workflow automation platform for complex, mission-critical data and ML processes at scale. It allows users to describe their ML/data pipelines using Python, Java, or Scala (with other languages planned) and manages the data flow, parallelization, scaling, and orchestration of these pipelines.

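As a reminder of what a Flyte pipeline looks like in code, here is a minimal sketch using the flytekit Python API; it is a generic example rather than a demonstration of the 0.13.0 features below, and the task logic and values are hypothetical.

```python
# Minimal flytekit sketch: a typed task wired into a workflow.
# Register and run it against a Flyte deployment, or call wf() locally for testing.
from flytekit import task, workflow

@task
def normalize(x: float, mean: float, std: float) -> float:
    """Scale a single value; stands in for a real data/ML step."""
    return (x - mean) / std

@workflow
def wf(x: float = 10.0) -> float:
    # Flyte handles data flow, parallelization, and orchestration between tasks.
    return normalize(x=x, mean=5.0, std=2.0)

if __name__ == "__main__":
    print(wf(x=12.0))
```
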
In version 0.13.0, Flyte adds a variety of improvements. Highlights include:

Platform: 

  • Support for the complete OAuth2 spec, including single sign-on, and configuration examples for popular IdPs are now available in Flyte. Please see the updated information and description of the feature, and the setup information.
  • Backend improvements to support dynamic workflow visualization and simpler debugging for Kubernetes and external errors (UI updates in future releases).
  • flytectl: a cross-platform, portable CLI for Flyte
  • Documentation site overhaul and redesign
  • Improved end-to-end platform performance

Flytekit:

  • Beta: Updated API to interact with past executions and launch new ones. This provides simplified programmatic access to all Flyte features and supports regular data science tasks such as retrieving previous results and comparing executions.
  • Beta: Support for prebuilt container plugins with faster user interactivity
  • Plugin: Interact with any SQL database using SQLAlchemy
  • Plugin: Use versioned datasets with DoltDB
  • Access secrets using a standardized interaction pattern

To learn more about the Flyte 0.13.0 release, check out the full release notes. Want to get involved with Flyte? Be sure to join the Flyte-Announce and Flyte-Technical-Discuss mailing lists to join the community and stay connected on the latest updates. 

Congratulations to the Flyte team and we look forward to continued growth and success as part of the LF AI & Data Foundation! To learn about hosting an open source project with us, visit the LF AI & Data Foundation website.


RosaeNLG Joins LF AI & Data as New Sandbox Project

The LF AI & Data Foundation, the organization building an ecosystem to sustain open source innovation in artificial intelligence (AI), machine learning (ML), deep learning (DL), and data open source projects, is announcing that RosaeNLG has joined the Foundation as its first Sandbox project.

The Sandbox stage was recently added by the LF AI & Data Technical Advisory Council (TAC) to accommodate early-stage projects that meet one or more of the following requirements:

  • Any project that intends to join LF AI & Data Incubation in the future and wishes to lay the foundations for that.
  • New projects that are designed to extend one or more LF AI & Data projects with functionality or interoperability libraries. 
  • Independent projects that fit the LF AI & Data mission and provide the potential for a novel approach to existing functional areas (or are an attempt to meet an unfulfilled need).

RosaeNLG is a great fit for this stage and was voted in by the TAC at the Sandbox stage. It is an open source Natural Language Generation (NLG) project that aims to offer the same NLG features as productized NLG solutions while remaining developer- and IT-friendly for ease of integration and configuration. RosaeNLG was released and open sourced by Ludan Stoecklé, CTO of the Data & AI Lab at BNP Paribas CIB and Expert Professor at aivancity school.

Dr. Ibrahim Haddad, Executive Director of LF AI & Data, said: “RosaeNLG is a great foundational project that aims to broaden the accessibility and understandability of AI. We’re excited to welcome RosaeNLG as our first Sandbox stage project and look forward to supporting its journey for increased adoption, growth, and collaboration with other projects.”  

Template-based Natural Language Generation (NLG) automates the production of relatively repetitive texts based on structured input data and textual templates, run by an NLG engine. Production usage is widespread in large corporations, especially in the financial industry.

Typical use cases are:

  • Describing a product based on its features for SEO purposes
  • Producing structured reports, such as risk reports or fund performance reports, in the financial industry
  • Generating well-formed chatbot answers

RosaeNLG templates are developed in VS Code with a friendly syntax and are easy to integrate. The project currently supports English, French, German, Italian, and Spanish, with linguistic resources for each, and provides NLG on both the server side (via a node.js REST API) and the browser side.

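To give a flavor of the template-based approach in general, here is a small data-to-text sketch in plain Python with string templates. This is not RosaeNLG's Pug-based syntax, and the fund data is invented; it only illustrates the core idea of combining structured input data with a textual template.

```python
# Generic illustration of template-based data-to-text generation (NOT RosaeNLG syntax):
# structured input data + a textual template -> repetitive report text.
# RosaeNLG itself uses Pug-based templates with richer linguistic features.
from string import Template

template = Template(
    "The fund $name returned $perf% in $period, "
    "$direction its benchmark by $delta points."
)

funds = [
    {"name": "Alpha Growth", "perf": 4.2, "period": "Q1 2021", "delta": 1.1},
    {"name": "Euro Bond", "perf": -0.8, "period": "Q1 2021", "delta": -0.3},
]

for fund in funds:
    direction = "beating" if fund["delta"] >= 0 else "trailing"
    print(template.substitute(direction=direction, **fund))
```
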
Ludan Stoecklé, the founder of RosaeNLG, said: “Non-expert users don’t understand long tables of figures and dashboards; they prefer simple textual explanations. NLG is key in the democratization and understandability of data in general and to trusted AI in particular. Template-based NLG is the only way to achieve complex data-to-text projects without any error or hallucination in the texts, which is mandatory for trust. The support of the LF AI & Data Foundation will foster adoption and community growth, as well as diversity in NLG domain, with the goal to support more than 50 commonly spoken languages.”

LF AI & Data supports projects via a wide range of services, and the first step is joining the Foundation in incubation. Learn more about RosaeNLG on their GitHub and be sure to join the RosaeNLG-Announce and RosaeNLG-Technical-Discuss mailing lists to join the community and stay connected on the latest updates.

A warm welcome to RosaeNLG! We look forward to the project’s continued growth and success as part of the LF AI & Data Foundation. To learn about how to host an open source project with us, visit the LF AI & Data website.


ONNX 1.9 Release Now Available!

ONNX, an LF AI & Data Foundation Graduated Project, has released version 1.9! ONNX is an open format to represent deep learning models. With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them. 

In version 1.9, ONNX adds a variety of improvements. Highlights include:

  • Selective schema loading of specific operator set versions to reduce memory usage in runtimes
  • New and updated operators to support more data types and models including advanced object detection models like MobileNetV3, YOLOv5
  • Improved tools for splitting large, multi-GB models into separate files
  • More details and sample tests added to operator documentation
  • Version converter enhanced to make it easier to upgrade models to newer operator sets (see the sketch after this list)

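For instance, the enhanced version converter is exposed through the onnx Python package; a minimal sketch, with a hypothetical model file name and target opset, looks like this.

```python
# Minimal sketch: check a model and upgrade it to a newer operator set.
# "model.onnx" and the target opset value (13) are hypothetical placeholders.
import onnx
from onnx import version_converter

model = onnx.load("model.onnx")
onnx.checker.check_model(model)

# Upgrade the model's default (ai.onnx) opset to version 13
converted = version_converter.convert_version(model, 13)
onnx.save(converted, "model_opset13.onnx")
```
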
To learn more about the ONNX 1.9 release, check out the full release notes. Want to get involved with ONNX? Be sure to join the ONNX Announce mailing list to join the community and stay connected on the latest updates. 

Congratulations to the ONNX team and we look forward to continued growth and success as part of the LF AI & Data Foundation! To learn about hosting an open source project with us, visit the LF AI & Data Foundation website.


New LF AI & Data Member Welcome – Q1 2021

We are excited to welcome six new members to the LF AI & Data Foundation. VMware has joined as a General Member, and Galgotias University, High School Technology Services, OpenUK, Ken Kennedy Institute, and the University of Washington – Tacoma have joined as Associate Members. 

The LF AI & Data Foundation will build and support an open community and a growing ecosystem of open source AI, data, and analytics projects by accelerating development and innovation, enabling collaboration, and creating new opportunities for all members of the community.

Learn more about the new organizations in their own words below:

General Members

The LF AI & Data General membership is targeted for organizations that want to put their organization in full view in support of LF AI & Data and our mission. Organizations that join at the General level are committed to using open source technology, helping LF AI & Data grow, voicing the opinions of their customers, and giving back to the community.

VMware streamlines the journey for organizations to become digital businesses that deliver better experiences to their customers and empower employees to do their best work. Our software spans App Modernization, Cloud, Networking & Security, and Digital Workspace.

Associate Members

The LF AI & Data Associate membership is reserved for pre-approved non-profits, open source projects, and government entities. 

Galgotias University is devoted to excellence in teaching, research, and innovation, and to developing leaders who’ll make a difference to the world. The University, which is based in Greater Noida, has an enrollment of over 15,000 students across more than 100 Undergraduate and Postgraduate programs.


High School Technology Services strives to provide the highest quality information technology services to high schools, teenagers, and adults. From creating websites to building computer labs, from offering counseling sessions to designing preparation programs, we aim to provide an assortment of services to support the dreams of these teenagers, their parents, and the schools.

 

OpenUK advocates for Open Technology being open source software, open source hardware, and open data, “Open” in and for business communities across the UK. As an industry advocacy organization, OpenUK gives its participants greater influence than they could ever achieve alone. Collaboration is central to everything we do and we use it to bring together business, public sector, and community in the UK to collaborate locally and globally.

 

The Ken Kennedy Institute is the virtual home of over two hundred faculty members and senior researchers at Rice University spanning computer science, mathematics, statistics, engineering, natural sciences, humanities, social sciences, business, architecture, and music.

The Institute brings together the Rice community to collaboratively solve critical global challenges by fostering innovations in computing and harnessing the transformative power of data. We enable new conversations, drive interdisciplinary research in AI and data science, develop new technology to serve society, advance the current and future workforce, promote an ecosystem of innovation and entrepreneurship, and develop academic, industry, and community partnerships in the computational sciences.

 

University of Washington – Tacoma is an urban-serving university providing access to students in a way that transforms families and communities. We impact and inform economic development through community-engaged students and faculty. We conduct research that is of direct use to our community and region. And, most importantly, we seek to be connected to our community’s needs and aspirations.

Welcome New Members!

We look forward to partnering with these new LF AI & Data Foundation members to help support open source innovation and projects within the artificial intelligence (AI), machine learning (ML), deep learning (DL), and data space. Welcome to our new members!

Interested in joining the LF AI & Data community as a member? Learn more here and email membership@lfaidata.foundation for more information and/or questions. 


Egeria 2.8 Release Now Available!

Egeria, an LF AI & Data Foundation Graduated Project, has released version 2.8! Egeria is an open source project dedicated to making metadata open and automatically exchanged between tools and platforms, no matter which vendor they come from.

In version 2.8, Egeria adds a variety of improvements. Highlights include:

  • New support for event and property filtering for the open metadata server security connector
    • The repository services support three filtering points for managing events on the OMRS Cohort Topic; however, these filtering points are set up in the server’s configuration document, which provides no control for filtering events for specific instances. Version 2.8 extends the metadata server security connector so it can be called at these same filter points.
    • The security server connector will have two new interfaces that it can implement: one for the cohort events and one for saving events to the local repository.
      • The event interface will have two methods, one for sending and one for receiving. The parameters will include the cohort name and the event contents. It can return the event unchanged, return a modified event (e.g. with sensitive content removed) or return null to say that the event is filtered out.
      • The event saving interface will receive the instance header and can return a boolean to indicate if the local repository should store it. If true is returned, the refresh event sequence is initiated. The repository connector then has the ultimate choice when the refreshed instance is returned from the home repository as to whether to store it or not.
  • Changes to metadata types
    • Updates to the location types in model 0025:
      • Add the mapProjection property to the FixedLocation classification
      • Change the address property to networkAddress in the CyberLocation classification
      • Deprecate HostLocation in favor of the AssetLocation relationship
    • Deprecate the RuntimeForProcess relationship since it is superfluous – use ServerAssetUse since Application is a SoftwareServerCapability.
    • Replace the deployedImplementationType property with the businessCapabilityType in the BusinessCapability since it is a more descriptive name.
  • New performance workbench for the CTS (technical preview)
    • The performance workbench is intended to test the response time of all repository (metadata collection) methods for the technology under test. The volume of the test can easily be configured to also test scalability.
  • New interface for retrieving the complete history of a single metadata instance
    • Two new (optional) methods have been introduced to the metadata collection interface:
      • getEntityDetailHistory
      • getRelationshipHistory
    • Both methods take the GUID of the instance for which to retrieve history, an optional range of times between which to retrieve the historical versions (or if both are null to retrieve all historical versions), and a set of paging parameters.
    • If not implemented by a repository, these will simply throw FunctionNotSupported exceptions by default to indicate that they are not implemented.
  • Splitting of CTS results into multiple smaller files
    • Up to this release, the detailed results of a CTS run could only be retrieved by pulling a huge (hundreds of MB) file across the CTS REST interface. Aside from not typically working with most REST clients (like Postman), this caused a sudden, large hit on the JVM heap to serialize such a large JSON structure (immediately grabbing ~1GB of the heap). While the old interface still exists for backwards compatibility, the new default interface provided in this release allows users to pull down an overall summary of the results separately from the full detailed results, and the detailed results are now broken down into separate files by profile and test case, each of which can therefore be retrieved individually.
  • Bug fixes and other updates
    • Additional Bug Fixes
    • Dependency Updates

To learn more about the Egeria 2.8 release, check out the full release notes. Want to get involved with Egeria? Be sure to join the Egeria-Announce and Egeria-Technical-Discuss mailing lists to join the community and stay connected on the latest updates. 

Congratulations to the Egeria team and we look forward to continued growth and success as part of the LF AI & Data Foundation! To learn about hosting an open source project with us, visit the LF AI & Data Foundation website.


SOAJS Joins LF AI & Data as New Incubation Project

LF AI & Data Foundation, the organization building an ecosystem to sustain open source innovation in artificial intelligence (AI), machine learning (ML), deep learning (DL), and Data open source projects, today is announcing SOAJS as its latest Incubation Project. 

SOAJS is an open source microservices and API management platform. SOAJS simplifies and accelerates the adoption of a multi-tenant microservices architecture by eliminating proliferation pain. The platform empowers organizations to create and operate a microservices architecture capable of supporting any framework while providing API productization, multi-tenancy, multi-layer security, cataloging, and awareness, and it adapts to existing source code to automatically catalog and release software components with multi-tenant, multi-version, and multi-platform capabilities. SOAJS integrates and orchestrates multiple infrastructures and technologies in a simple, secure approach while accelerating the release cycle with custom continuous integration and a smart continuous delivery pipeline. The platform can create and manage custom environments per product, department, team, resource, and technology in a simple way that empowers every member of the organization.

SOAJS was released and open sourced by Herron Tech.

Dr. Ibrahim Haddad, Executive Director of LF AI & Data, said: “We are happy to welcome SOAJS to LF AI & Data and help it thrive in a neutral, vendor-free environment under an open governance model. With SOAJS, we’re able now to offer a Foundation project that offers a complete enterprise open source microservice management platform and already has ongoing collaborations with existing LF AI & Data projects such as Acumos. We look forward to tighter collaboration between SOAJS and all other projects to drive innovation in data, analytics, and AI open source technologies.” 

The API-Aware Pipeline can be a key contribution to the LF AI ML Workflow and Interop Committee

DevOps automation that is limited to infrastructure deployment and source code release addresses only a fraction of the challenges of managing the development, deployment, and operation of large numbers of APIs and microservices in today’s complex environments. For teams to achieve the agility promised by modern application development, they need DevOps automation that is API- and microservice-aware.

SOAJS delivers API-Aware DevOps, with rich API- and Microservice-optimized automation capabilities that enable high-performance, agile execution.

The SOAJS API-Aware pipeline can be a key contribution to LF AI’s ML Workflow and Interop Committee by helping multiple projects close the loop and take advantage of its Multi Environment Marketplace, Automated Cataloging, Smart Deployment, Multi Tenant Authentication/Authorization Gateway, and Middleware to standardize, release, deploy, and operate ML models the way Acumos does today.

Antoine Hage, co-creator of SOAJS and co-founder of Herron Tech, the sponsor of SOAJS, said: “We are thrilled to join the LF AI & Data community and look forward to growing SOAJS while helping other projects solve the interoperability challenges.”

SOAJS positions itself as the only end-to-end microservices management platform that helps organizations transform and achieve durable agility through a complete and adaptable solution, with a combination of features and capabilities not found in competing platforms.

Key features of the SOAJS platform include:

  • API Management and Marketplace
    • API Builder: passthrough and smart endpoint
    • API Framework: build microservices tenfold faster
    • Heterogeneous Catalog with source code integration and adaptation 
    • Complete pipelines management for APIs and resources
      • API: service, microservices, passthrough, smart endpoint
      • Daemons: cron jobs, interval, parallel
      • Resource: clusters, bridge to existing, binary
      • Front end: nginx, Multi domain, Automated SSL
      • Custom: custom applications, custom packages, monolithic
      • Recipe: deploy recipe, standardization 
    • Easily adapt to existing APIs & legacy systems
  • Multi Environment Orchestration and Deployment
    • Cloud orchestration & distributed architecture
    • Smart multi environment deploy with ledger, rollback and multi version support
    • Deploy from any source code and binary
    • Infra as Code native & 3rd party support
    • Container & VM orchestration
    • Import/export/clone environments
    • Custom CI/CD with Smart Release
    • Support multi continuous integration server
    • Support multi GIT server
  • Multi Tenant Authentication and Authorization Gateway
    • API productization and packaging
    • Multi environment and multi version support 
    • Automatic awareness and mesh among microservices
    • Multi tenant authentication and authorization with roaming
    • Registry and resource configuration and management
    • API monitoring and performance measurement
  • CLI and Single Pane of Glass Management Console
    • Full monitoring, high availability, analytic and control
    • Correlation between resources and traffic analytic vs errors in logs
    • User management with federation and access control
    • Notification with trackability ledger

LF AI & Data supports projects via a wide range of services, and the first step is joining as an Incubation Project. LF AI & Data will provide neutral, open governance for SOAJS to help foster the growth of the project. Check out the Get Started Guide and Demos to start working with SOAJS today. Learn more about SOAJS on their website and be sure to join the SOAJS-Announce and SOAJS-Technical-Discuss mailing lists to join the community and stay connected on the latest updates.

A warm welcome to SOAJS! We look forward to the project’s continued growth and success as part of the LF AI & Data Foundation. To learn about how to host an open source project with us, visit the LF AI & Data website.


Join LF AI & Data at Kubernetes AI Day!

The LF AI & Data Foundation is pleased to be a co-host at the upcoming Kubernetes AI Day! The event will be held virtually on May 4, 2021, and registration is only US$20.

Kubernetes is becoming a common substrate for AI, allowing workloads to run either in the cloud or in an organization’s own data center and to scale easily. This event is great for developers who are interested in deploying AI at scale using Kubernetes.

The agenda is now live! Please note the times below are displayed in Pacific Daylight Time (PDT).

Tuesday, May 4, 2021

  • 1:00 PDT – Opening Remarks
  • 1:05 PDT – Scaling ML pipelines with KALE — the Kubeflow Automated Pipeline Engine
  • 1:40 PDT – A K8s Based Reference Architecture for Streaming Inference in the Wild
  • 2:15 PDT – Embrace DevOps Practices to ML Pipeline Development
  • 2:45 PDT – Break
  • 3:05 PDT – Taming the Beast: Managing the day 2 operational complexity of Kubeflow
  • 3:40 PDT – The SAME Project: A Cloud Native Approach to Reproducible Machine Learning
  • 4:10 PDT – Break
  • 4:25 PDT – Stand up for ethical AI! How to detect and mitigate AI bias using Kubeflow
  • 5:00 PDT – The production daily life: An end to end experience of serverless machine learning, MLOps and models explainability
  • 6:30 PDT – Closing Remarks

Visit the event website for more information about the schedule and speakers. Join us by registering to attend Kubernetes AI Day – Register Now!

The LF AI & Data Foundation’s mission is to build and support an open AI community, and drive open source innovation in the AI, ML, and DL domains by enabling collaboration and the creation of new opportunities for all the members of the community. 

Want to get involved with the LF AI & Data Foundation? Be sure to subscribe to our mailing lists to join the community and stay connected to the latest updates.
