Advances in Machine Learning & Artificial Intelligence (AMLAI)

ISSN: 2769-545X | DOI: 10.33140/AMLAI

Impact Factor: 1.3

Exploring the Integration of Machine Learning Models in Programming Languages on GitHub: Impact on Compatibility to Address Them

Abstract

Faten Slama, Imen Ismail and Lassaad Latrach

GitHub repositories are widely used for collaborative development, allowing multiple developers to work on the same codebase and contribute changes. Each repository is typically associated with a specific project and can contain code files, documentation, bug reports, feature requests, directories, and other project resources. It is also often associated with a particular programming language. By default, GitHub automatically detects the primary programming language of a repository from the file extensions and content it contains. However, this detection is not always accurate. In particular, the detected language may not reflect the programming languages actually used in the project, especially if the project uses multiple programming languages or has undergone language migrations. In this study, we apply an alternative technology that addresses problems in classifying the programming language of a GitHub repository by analysing file extensions and identifying all programming languages used in the project. We also determine the appropriate primary programming language for the repository. This paper investigates how this technology can address the issues surrounding GitHub’s automatic detection of a repository’s primary programming language and how it can provide information on all the programming languages used in a project.
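As an illustration of the general idea of extension-based language identification described in the abstract, the following is a minimal sketch and not the authors' method. It assumes a local checkout of a repository, a small hypothetical extension table (`EXTENSION_TO_LANGUAGE`), and a simple "most bytes wins" rule for the primary language. The function names `language_breakdown` and `primary_language` are introduced here for illustration only.

```python
from collections import Counter
from pathlib import Path

# Hypothetical, deliberately small extension-to-language table. This is an
# assumption for illustration, not the mapping used in the paper or by GitHub.
EXTENSION_TO_LANGUAGE = {
    ".py": "Python",
    ".js": "JavaScript",
    ".ts": "TypeScript",
    ".java": "Java",
    ".c": "C",
    ".cpp": "C++",
    ".go": "Go",
    ".rs": "Rust",
    ".rb": "Ruby",
}


def language_breakdown(repo_root):
    """Return a Counter mapping each recognised language to its total bytes."""
    root = Path(repo_root)
    totals = Counter()
    for path in root.rglob("*"):
        # Skip Git metadata and anything that is not a regular file.
        if ".git" in path.relative_to(root).parts or not path.is_file():
            continue
        language = EXTENSION_TO_LANGUAGE.get(path.suffix.lower())
        if language is not None:
            totals[language] += path.stat().st_size
    return totals


def primary_language(totals):
    """Pick the language with the most bytes, or None if nothing matched.

    Ties are broken alphabetically so the result is deterministic.
    """
    if not totals:
        return None
    return max(sorted(totals), key=lambda name: totals[name])
```

Calling `language_breakdown(".")` inside a checkout would list every recognised language with its byte count, and `primary_language` would then select one of them. Weighting by bytes is only one possible choice; counting files or lines instead would be equally easy to swap in, and a real classifier would need a far larger extension table and a strategy for ambiguous extensions.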
