Thoughts & Ideas
"First, design the algorithm. Then, revolutionize the world with AI"
The underlying disparities within artificial intelligence (AI) have the potential to influence global health as AI technology becomes increasingly embedded in various sectors. For instance, when I tasked a sophisticated AI model with producing images of a Black African physician caring for a sick white child, it encountered significant challenges. These difficulties arise because such algorithms are trained on datasets imbued with societal biases, including racial prejudices rooted in the colonial era that persist in global health. There is a growing movement advocating for the "decolonization of global health," which emphasizes the vast power imbalance between those most impacted by health disparities and those who set the research agendas, establish standards, disseminate findings, control data, and lead the field. Presently, over 80% of global health leaders hail from wealthy nations. This imbalance is further perpetuated by leading global health funding organizations, such as the U.S. Agency for International Development (USAID) and the Wellcome Trust, which predominantly distribute grants to researchers and institutions within their own or other high-income countries.
To address global health inequities, it is crucial to invest in AI training and resources within lower-income nations, enabling local experts to develop and manage AI solutions tailored to their specific health challenges. This involves ensuring that local voices are prioritized in decision-making processes and fostering international collaborations that emphasize the leadership of those from affected regions. Funding should be directed toward local researchers and institutions, empowering them to lead projects that directly affect their communities. Implementing rigorous ethical guidelines and standards in AI development can help prevent the reinforcement of existing biases. Finally, a coordinated effort to balance power and resource distribution is essential to fully leverage AI's potential in advancing global health equity.
Image: Biased AI model

In Africa, a burgeoning movement aims to achieve these goals through the use of AI. Grassroots efforts have resulted in the development of programs and communities dedicated to leveraging AI to tackle local issues. These initiatives are most likely to succeed when spearheaded by Africans, who are the true experts on their own communities and challenges.
Data Science Africa (DSA), a grassroots organization founded by Africans for Africans, is dedicated to building local capacity and creating solutions to African problems through data science, including machine learning and AI. Since its inception in 2015, DSA has collaborated annually with researchers at a local university in various African cities to host a five-day event. This event comprises a three-day summer school followed by a two-day workshop, fostering education and innovation in data science across the continent. This gathering offers a venue for participants to present their data science innovations while building connections and a strong community network. DSA receives financial support from corporations and philanthropic initiatives such as Arm, DeepMind, Lacuna Fund, and AI4D Africa. These contributions fund the annual conference, provide research grants, and offer fellowships to African data scientists. Furthermore, DSA partners with similar efforts, including Deep Learning Indaba, which is the annual convention for the African machine learning and AI community. DSA exemplifies values that could greatly enhance the global health sector, such as fostering community, encouraging collaboration, and recognizing the importance of African expertise. The health-related AI and data science projects developed by DSA researchers often incorporate extensive input and engagement from local community members, ensuring a comprehensive and inclusive data science approach.
As an African AI enthusiast, I am dedicated to addressing global health inequities by developing AI solutions that tackle specific health challenges in lower-income nations. For instance, three other AI engineers from Rwanda and I have been working on creating predictive models to identify outbreaks of infectious diseases, using local data to ensure accuracy and relevance. We will first prototype this model in February 2025 at Ubumwe Grande Hotel. I believe many other innovative ideas exist on our continent, but due to the lack of funding for local researchers, they often go unrealized, which is why I believe balancing resource distribution is essential to fully leverage our potential in advancing global health equity.
During my first months at the WHO HQ in Geneva, Switzerland, I met a Kenyan national who leads the Clinton Health Access Initiative’s (CHAI) digital health work. He showed me their project, which records patient data in Kenya, and it inspired me to start researching and developing an AI app that can analyze healthcare records and patient feedback in local languages, improving diagnostic accuracy and patient care. My participation in the AI for Good event at the United Nations has significantly enriched my approach, providing insights and connections that help balance power and resource distribution. This experience has reinforced my commitment to leveraging AI to advance global health equity and make a tangible impact in my community.
End-to-end community involvement means that projects are initiated and concluded with active participation from local communities. Events like AI for Good that engage local practitioners should become more frequent and, ideally, be hosted on our continent.
Another notable example is the development of an end-to-end air quality management system called AirQo by researchers at Makerere University in Uganda. This system, designed to combat air pollution—a significant environmental health risk—was launched in 2015 and first implemented in Kampala in 2018. By 2022, it had expanded to eight additional African cities outside of Uganda. The World Health Organization reports that in 2019, air pollution contributed to 4.2 million premature deaths globally and 383,419 deaths in Africa.
The development of AirQo extended beyond technical data collection and data science methods to include comprehensive community engagement. The AirQo team created user-friendly platforms, such as web and mobile applications, to disseminate data to the public and policymakers. They also organized community engagement sessions in local languages at schools and community halls to discuss specific causes of air pollution, like motor vehicles.
One initiative involved installing air pollution monitors on motorcycle taxis to collect air quality data across a wider area of the city. The project also enlisted mechanics as ambassadors for better air quality. These monitors were specifically designed to operate effectively in typical African weather conditions. By incorporating community engagement and input, these end-to-end systems effectively address issues unique to the local community.
Recognizing that AI alone cannot address the profound issues of resource and power imbalances underlying health disparities in African nations is crucial. AI is merely one tool among many, but when applied by skilled professionals with the necessary resources, it can contribute to solving health-related challenges. Involving communities in identifying and tackling local problems significantly enhances the chances of gaining their support for these solutions. Rather than imposing pre-set agendas and programs on communities, this method empowers them to create their own solutions, leading to greater ownership and sustainability. By prioritizing local engagement and expertise, we can ensure that AI-driven health interventions are not only effective but also culturally appropriate and enduring.
Addressing global health inequities through AI requires a multifaceted approach that prioritizes local expertise and community engagement. The efforts of organizations like Data Science Africa and the development of innovative projects such as AirQo demonstrate the potential for AI to make significant strides in improving health outcomes when driven by those who understand their communities' unique challenges. These initiatives underscore the importance of shifting the power dynamics in global health, ensuring that those most affected by health disparities have a leading role in developing and implementing solutions.
In conclusion, the success of AI in advancing global health equity hinges on a concerted effort to balance power and resource distribution. By investing in local researchers and fostering international collaborations that emphasize local leadership, more effective, culturally appropriate, and sustainable health interventions can be created. It is imperative to support grassroots movements and community-led projects, which are more likely to yield lasting impacts.
Sunday, 10th March 2024
The rapid advancement of technology has revolutionized many fields, with neural networks (NNs) at the forefront of these developments. One area where NNs have made a significant impact is in machine translation (MT). While much attention has been given to popular languages, the potential of NNs in translating less commonly spoken or "unpopular" languages holds great promise. This article explores the intricacies of neural networks in machine translation, emphasizing their benefits for underrepresented languages.
Neural networks are a subset of artificial intelligence (AI) that mimic the human brain's interconnected neuron structure. They excel at recognizing patterns and making decisions based on large datasets. In the context of machine translation, neural networks can learn linguistic patterns from massive amounts of bilingual text data, enabling them to translate text from one language to another with increasing accuracy. The development of neural machine translation (NMT) has been a game-changer. Unlike earlier rule-based systems, which depend on hand-crafted rules, and statistical machine translation (SMT), which relies on count-based statistical models, NMT leverages deep learning end to end. This allows for more fluid, natural translations that better capture the nuances of languages.
Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks were among the first neural architectures applied to machine translation. Their ability to handle sequential data makes them suitable for language processing tasks.
However, plain RNNs often struggle with long-term dependencies, which LSTMs mitigate through their gating mechanisms. A typical LSTM-based encoder-decoder model processes the input sequence through embedding and LSTM layers and generates the output sequence through a dense layer. For instance, using TensorFlow, such a model can be defined and compiled in just a few lines of code:
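The following is a minimal Keras sketch of such an encoder-decoder, not code from any particular project; the vocabulary sizes, embedding dimension, and hidden dimension are illustrative placeholders.

```python
# Minimal LSTM encoder-decoder sketch in TensorFlow/Keras.
# All sizes below are placeholders chosen for illustration.
from tensorflow.keras import layers, Model

src_vocab, tgt_vocab = 8000, 8000   # assumed vocabulary sizes
embed_dim, hidden_dim = 256, 512    # assumed layer sizes

# Encoder: embed the source tokens and keep only the final LSTM states.
enc_inputs = layers.Input(shape=(None,), dtype="int32", name="source_tokens")
enc_embed = layers.Embedding(src_vocab, embed_dim, mask_zero=True)(enc_inputs)
_, state_h, state_c = layers.LSTM(hidden_dim, return_state=True)(enc_embed)

# Decoder: start from the encoder states and predict the next target token.
dec_inputs = layers.Input(shape=(None,), dtype="int32", name="target_tokens")
dec_embed = layers.Embedding(tgt_vocab, embed_dim, mask_zero=True)(dec_inputs)
dec_outputs = layers.LSTM(hidden_dim, return_sequences=True)(
    dec_embed, initial_state=[state_h, state_c])
logits = layers.Dense(tgt_vocab, activation="softmax")(dec_outputs)

model = Model([enc_inputs, dec_inputs], logits)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```

Note that this shows only the training-time graph; at inference the decoder is run step by step, feeding each predicted token back in as the next input.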
Transformers have revolutionized NMT with their self-attention mechanisms, allowing for better handling of long-range dependencies and parallelization. For example, using the MarianMT model from the Hugging Face library, we can load a pre-trained model and tokenizer to translate text effectively.
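As a brief illustration of that workflow, the snippet below loads a public English-to-French MarianMT checkpoint from the Hugging Face Hub; the model name is only an example, and any available language pair could be substituted.

```python
# Translate a sentence with a pre-trained MarianMT checkpoint.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-fr"  # example checkpoint
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

batch = tokenizer(["Neural networks learn linguistic patterns from data."],
                  return_tensors="pt", padding=True)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```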
Data Augmentation Techniques
For unpopular languages, data scarcity is a critical issue. Limited datasets mean that neural networks do not have enough material to learn from, which affects the quality of translations. Data augmentation techniques can help generate synthetic data to improve model training. One method I have found particularly useful is back-translation. This technique takes a large monolingual corpus in the target language, translates it into a high-resource language with a reverse-direction model, and pairs the machine-translated output with the original sentences. Because the target side of each synthetic pair is genuine text, the additional data closely resembles natural language usage in the target language.
Back-translation has proven highly effective in my experience. For instance, I once worked on improving translation quality for a low-resource language by leveraging back-translation. Using MarianMT models, the process of generating synthetic data was automated, enhancing the dataset without requiring extensive manual intervention.
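Below is a hedged sketch of how that automation might look: monolingual sentences in the target language are translated into English by a reverse-direction MarianMT model, and each machine-translated sentence is paired with its original to form synthetic parallel data. The Swahili-to-English checkpoint and the example sentences are purely illustrative.

```python
# Back-translation sketch: build synthetic (source, target) pairs from
# monolingual target-language text using a reverse-direction model.
from transformers import MarianMTModel, MarianTokenizer

reverse_name = "Helsinki-NLP/opus-mt-sw-en"  # illustrative: target -> English
rev_tokenizer = MarianTokenizer.from_pretrained(reverse_name)
rev_model = MarianMTModel.from_pretrained(reverse_name)

# Monolingual sentences in the target language (placeholders).
target_sentences = ["Habari za asubuhi.", "Asante sana kwa msaada wako."]

batch = rev_tokenizer(target_sentences, return_tensors="pt", padding=True)
outputs = rev_model.generate(**batch)
synthetic_sources = rev_tokenizer.batch_decode(outputs, skip_special_tokens=True)

# Each pair (synthetic English, original target sentence) becomes extra
# training data for the forward English -> target model.
parallel_pairs = list(zip(synthetic_sources, target_sentences))
print(parallel_pairs)
```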
Additionally, I have found that large language models such as GPT-3 can generate synthetic training data that is both diverse and contextually relevant. Even GPT-2, with its coherent text generation capability, can create realistic text snippets given a prompt. This approach has allowed me to enrich the training datasets for several low-resource languages, thereby improving the robustness and accuracy of the translation models.
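A minimal sketch of that kind of prompting with GPT-2 through the Hugging Face pipeline is shown below; the prompt and sampling settings are illustrative only, and generated snippets would still need filtering before being added to a training set.

```python
# Generate candidate synthetic text snippets with GPT-2.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "The clinic reported that the patient"  # illustrative prompt
samples = generator(prompt, max_new_tokens=40, num_return_sequences=3,
                    do_sample=True, temperature=0.9)
for sample in samples:
    print(sample["generated_text"])
```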
Transfer Learning
In my work, I have often leveraged transfer learning to address the limitations of low-resource languages. Transfer learning involves using pre-trained models on high-resource languages and fine-tuning them on low-resource languages. This technique can significantly boost translation quality by transferring knowledge from one domain to another. Fine-tuning a pre-trained model, such as BERT, on the target language data, has proven to be effective. The pre-trained model already understands a lot about language structure and semantics, which can be adapted to the specifics of the target language with relatively small amounts of additional data.
One of my apps, developed using Flutter and Dart, involved fine-tuning a pre-trained multilingual BERT model to translate medical texts for a low-resource language. The process began by loading the Hugging Face transformers library in Python, utilizing the BertTokenizer and BertForSequenceClassification classes. I then collected a dataset of medical texts and created a custom dataset loader using PyTorch's Dataset class. The pre-trained BERT model was fine-tuned with this dataset using the AdamW optimizer and a learning rate scheduler to adjust the learning rate dynamically during training. The fine-tuning involved freezing the initial layers of BERT to retain the general language understanding and training the final layers on the specific medical texts. This approach allowed the model to adapt to the specific vocabulary and linguistic nuances of the medical domain, achieving high-quality translations that would be unattainable with the limited available data for the target language. The fine-tuned model was then exported and integrated into the Flutter app using a REST API built with Flask, enabling real-time translation of medical texts within the app's user interface.
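The following is a hedged sketch of those fine-tuning steps, not the production code from that app: load a multilingual BERT checkpoint, freeze the embeddings and early encoder layers, and train the remaining layers with AdamW and a linear learning-rate schedule. The dataset contents, label count, and hyperparameters are placeholders.

```python
# Fine-tuning sketch: multilingual BERT with frozen lower layers (PyTorch).
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import (BertForSequenceClassification, BertTokenizer,
                          get_linear_schedule_with_warmup)

class MedicalTextDataset(Dataset):
    """Wraps (text, label) pairs and tokenizes them on the fly."""
    def __init__(self, texts, labels, tokenizer, max_len=128):
        self.texts, self.labels = texts, labels
        self.tokenizer, self.max_len = tokenizer, max_len

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        enc = self.tokenizer(self.texts[idx], truncation=True,
                             padding="max_length", max_length=self.max_len,
                             return_tensors="pt")
        item = {k: v.squeeze(0) for k, v in enc.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)  # placeholder label count

# Freeze the embeddings and the first eight encoder layers so the general
# multilingual representations are retained.
for param in model.bert.embeddings.parameters():
    param.requires_grad = False
for layer in model.bert.encoder.layer[:8]:
    for param in layer.parameters():
        param.requires_grad = False

train_ds = MedicalTextDataset(["example medical sentence"], [0], tokenizer)
loader = DataLoader(train_ds, batch_size=8, shuffle=True)

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=len(loader) * 3)

model.train()
for epoch in range(3):
    for batch in loader:
        optimizer.zero_grad()
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        scheduler.step()
```

Serving the fine-tuned model behind a small Flask endpoint and calling that endpoint over HTTP from the Flutter front end follows the integration pattern described above.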
Conclusion

Neural machine translation has opened the door to high-quality translation for languages that earlier approaches largely neglected. Techniques such as back-translation, synthetic data generation with large language models, and transfer learning from pre-trained multilingual models make it possible to build useful systems even when parallel data is scarce. Investing in these methods helps preserve linguistic diversity and connects speakers of underrepresented languages to the wider world.
Wednesday, 17th March 2024
GitHub Copilot, heralded as the world’s most widely adopted AI developer tool, has transformed the landscape of software development since its introduction. Developed by GitHub in collaboration with OpenAI, Copilot leverages advanced machine learning models to assist developers by suggesting code snippets, completing functions, and even writing entire blocks of code. This article delves into the features, benefits, and impact of GitHub Copilot on the development community.
The Genesis of GitHub Copilot
GitHub Copilot emerged from a collaboration between GitHub and OpenAI, leveraging the latter's Codex model, which is an extension of the GPT-3 language model. The goal was to create an AI that could seamlessly integrate with popular Integrated Development Environments (IDEs) to enhance developer productivity.
Launched in June 2021, GitHub Copilot quickly garnered attention for its ability to understand and generate code in various programming languages, including Python, JavaScript, TypeScript, Ruby, and Go, among others.
How GitHub Copilot Works
GitHub Copilot operates as an AI-powered code completion tool. It integrates into IDEs, providing real-time suggestions as developers type. By analyzing the context of the code being written, Copilot can suggest entire lines or blocks of code, function definitions, and even complex algorithms. It uses contextual cues from comments and previously written code to generate accurate and relevant suggestions.
Contextual Understanding
One of Copilot's standout features is its contextual understanding. It doesn't just suggest code based on syntax but also understands the intent behind the code. For example, if a developer writes a comment describing a function, Copilot can generate the corresponding function code. This capability significantly reduces the time spent on boilerplate code and allows developers to focus on more complex aspects of their projects.
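As a purely hypothetical illustration of this workflow (not captured Copilot output), a developer might type a one-line comment and accept a completion along the following lines:

```python
# Developer-written comment: return the n-th Fibonacci number iteratively.
# The function body below illustrates the kind of completion Copilot suggests.
def fibonacci(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fibonacci(10))  # 55
```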
Multi-Language Support
GitHub Copilot supports a wide range of programming languages. While it excels in popular languages like Python and Java, it also provides valuable assistance in less common languages. This versatility makes it a valuable tool for developers working on diverse projects, from web development to data science and beyond.
Benefits of GitHub Copilot
Increased Productivity
By automating repetitive coding tasks, Copilot allows developers to write code faster and with fewer errors. In GitHub's own research, developers using Copilot completed a benchmark coding task about 55% faster than a control group without it.
Enhanced Learning and Skill Development
For novice developers, Copilot serves as a valuable learning tool. By providing code suggestions and examples, it helps beginners understand best practices and coding patterns. This hands-on learning approach accelerates the acquisition of coding skills and boosts confidence among new developers.
Streamlined Code Reviews
Copilot’s ability to generate high-quality code also simplifies the code review process. By reducing the number of errors and improving code consistency, it makes it easier for teams to review and merge code changes. This results in more efficient collaboration and higher-quality software.
GitHub Copilot and the Open Source Community
GitHub Copilot has had a significant impact on the open source community, transforming how developers contribute to and maintain open source projects. According to GitHub's own data, over 30% of newly written code on GitHub is now assisted by Copilot, highlighting its widespread adoption and utility. By providing intelligent code suggestions and automating repetitive tasks, Copilot enables developers to focus on more innovative and complex aspects of their work. This boost in productivity can lead to faster development cycles and more frequent contributions to open source repositories. For instance, a survey conducted by GitHub found that 60% of developers reported an increase in their contribution frequency due to Copilot's assistance.

Moreover, Copilot's ability to assist with multiple programming languages and frameworks makes it a versatile tool for open source developers working on diverse projects. However, the integration of AI-generated code also raises important considerations regarding code quality, licensing, and intellectual property within the open source ecosystem. Ensuring that Copilot-generated code adheres to the principles of open source licensing and maintaining transparency in AI-assisted contributions will be essential for fostering a collaborative and ethical development environment.
The future of GitHub Copilot looks promising as it continues to evolve and improve. GitHub and OpenAI are actively working on refining the underlying models to enhance the accuracy and relevance of code suggestions, making the tool even more intuitive and capable. Additionally, there are plans to expand Copilot's integration with other development tools and platforms, further streamlining the development workflow. As the adoption of AI in software development grows, addressing ethical concerns will be paramount, and GitHub and OpenAI are committed to developing guidelines and best practices to ensure that Copilot's use is responsible and beneficial for the entire development community. This ongoing evolution promises to solidify Copilot's role in shaping the future of software development, offering new levels of productivity and innovation.
Sunday, 6th November 2022