Guest Author: Shirley Su, University of Southern California (USC) freshman majoring in Computer Science. Her passion is to utilize technology for social good: whether it is creating usable applications or advancing research, she wants to be a positive contributor for the community.
The coronavirus pandemic was a major obstacle to job opportunities this year, as it caused several internships to be cancelled. While disappointed, I still wanted to have a productive summer and further my experience in computer science. I decided to reach out to my fellow Microsoft AI Platform interns from last summer. We were eager to contact our former mentor and continue our projects on ONNX, this time as open-source contributors.
To explore recent advancements in computer vision, I was eager to research new machine learning models and contribute them to the ONNX Model Zoo. The ONNX Model Zoo provides many state-of-the-art pre-trained models that come with instructions for integrating them into applications. I investigated the tensorflow-onnx GitHub repo, which detailed the conversion of both EfficientNet-Lite4, an image classification model, and SSD-MobileNetV1, an object detection model. These are popular computer vision models, and I wanted to add them to the ONNX Model Zoo so others could more readily use them.
I began the conversion process by copying and running the Jupyter notebook from the GitHub repo. This involved setting up environment variables and downloading the pre-trained model. After saving the model, I ran the script to convert it from TensorFlow to the ONNX format. I then ran inference on the converted model with ONNX Runtime to view and validate the results. I also opened the models in Netron, an open-source viewer for visualizing the neural networks inside ONNX models. It shows the model’s operator set, the ONNX version, and the shapes and types of the input and output tensors. I included that information in the README template for the new model, as part of the instructions on how to pass in input data and how to interpret the output.
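As a rough illustration of the validation step, the sketch below runs a converted model through ONNX Runtime and lists its highest-scoring classes. It is not the notebook's code: the helper names `load_session`, `run_inference`, and `top_k` are made up for this example, and it assumes a model with a single input tensor and an output that can be read as a vector of class scores.

```python
import numpy as np


def load_session(model_path):
    """Create an ONNX Runtime session for a model file on disk."""
    # Imported lazily so the helpers below work with any session-like object.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def run_inference(session, batch):
    """Feed one input array to the model and return its outputs keyed by name.

    Assumption: the model has exactly one input tensor.
    """
    input_name = session.get_inputs()[0].name
    output_names = [output.name for output in session.get_outputs()]
    results = session.run(output_names, {input_name: batch})
    return dict(zip(output_names, results))


def top_k(scores, k=5):
    """Return (class_index, score) pairs for the k highest scores, best first."""
    scores = np.asarray(scores).ravel()
    order = np.argsort(scores)[::-1][:k]
    return [(int(i), float(scores[i])) for i in order]
```

With a real model file (the path here is hypothetical), usage would look like `outputs = run_inference(load_session("model.onnx"), batch)`, where `batch` is a NumPy array already preprocessed to the input shape and type that Netron reports.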
Comments from the ONNX community on GitHub were especially helpful in pointing out mistakes and helping me resolve issues in the model folders. For both the EfficientNet-Lite4 and SSD-MobileNetV1 models, the files containing the sample inputs and outputs were incorrect. To fix them, I converted each NumPy array to a TensorProto and saved the TensorProto as a serialized protobuf file. There was also an error in the postprocessing code for SSD-MobileNetV1, which produced an incorrect array of object detection predictions: although the model returned detection classes from inference, the most accurate class label was not being output. To fix this, I changed how the results were looped over so that the most accurate class label was included.
Trials and Tribulations
Another model that interested me was the Visual Question Answering (VQA) model, in which users input an image and a question about the image, and the model outputs an answer. I used a VQA GitHub repository that had the necessary files and an open-source license.
However, I ran into several issues during the process. The most time-consuming and tedious task was downloading 64 GB of MS COCO data onto my computer, all without a fast internet connection or a powerful machine. The download ran for several hours before my computer crashed. Realizing that this attempt was futile, I began to look into Microsoft’s Azure Virtual Machines, which had the memory and storage space I needed. Using a virtual machine shortened the download time from approximately 10 hours to just 1 hour.
After I had successfully downloaded and preprocessed the data, the next obstacle was exporting the model to ONNX. When I passed in the standard model arguments, PyTorch’s ONNX export call failed. Since the model was written with a much older version of PyTorch, I suspected that it needed updates to make it compatible with TorchScript and ONNX export.
I hope others in the open-source community will continue working on making this model accessible and contribute it to the model zoo.
Throughout this project, I gained more expertise working with Git and became more comfortable making pull requests and contributing to open source. I was able to contribute two important computer vision models to the ONNX Model Zoo. And while I was unable to contribute the VQA model, that project gave me hands-on experience with Azure virtual machines, a tool that is crucial when handling large amounts of data. I also became more comfortable reading others’ Python code and debugging errors efficiently.
This summer, our team was especially grateful for the help we received from our mentors, Vinitra Swamy and Natalie Kershaw. They gave us opportunities to make meaningful contributions to the ONNX Model Zoo. From setting aside time for weekly meetings with us to helping us debug stubborn errors, their guidance has been immensely helpful in our technical and professional development.