A function that extracts GitHub repository search results. Project for Online Data Collection (oDCM) course @ Tilburg University.

thtbui/github-repository-finder

GitHub Repository Finder - with sample dataset

Introduction

This notebook documents GRF (GitHub Repository Finder), a function that fetches GitHub repositories matching a keyword within a given timeframe, counted from the day of fetching.

Running instructions

Requirements:

  1. Generate your own GitHub token: Creating a personal access token
  2. Save your token as an environment variable named 'GITHUBTOKEN': Configuring Environment Variables
  3. Make sure the following Python packages are available: requests, math, datetime, dateutil, csv, pandas, json, os, time. Of these, math, datetime, csv, json, os and time ship with Python's standard library; requests, dateutil (distributed as python-dateutil) and pandas must be installed separately. Installation instructions can be found on the Python website.
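The token and package requirements above can be checked before running GRF. The following is a minimal sketch and not part of GRF itself: the helper names `load_github_token`, `auth_headers` and `missing_packages` are hypothetical, and the `Authorization: token ...` header is one form GitHub accepts for personal access tokens. How GRF actually passes the token is not shown in this README.

```python
import importlib.util
import os

# Third-party packages named in the requirements; the remaining ones
# (math, datetime, csv, json, os, time) are part of the standard library.
THIRD_PARTY = ["requests", "dateutil", "pandas"]


def load_github_token(env=None):
    """Return the token stored in the GITHUBTOKEN environment variable."""
    env = os.environ if env is None else env
    token = env.get("GITHUBTOKEN", "").strip()
    if not token:
        raise RuntimeError("Environment variable GITHUBTOKEN is not set or is empty.")
    return token


def auth_headers(token):
    """Build request headers carrying the token (hypothetical helper)."""
    return {"Authorization": f"token {token}"}


def missing_packages(names=THIRD_PARTY):
    """List the packages from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling `missing_packages()` returns an empty list when all three third-party packages are installed, and `load_github_token()` raises an error early if the environment variable was not set.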

Function structure:

GRF collects data in four steps, each carried out by a separate function: find_repo, export_repo_list, save_column and save_dt. The following diagram illustrates how these functions work together:

Fig. 1: GitHub Repository Finder components
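As a rough illustration of the search step, a keyword-plus-timeframe query could be expressed against GitHub's repository search API as sketched here. This is an assumption about how such a search might be built, not GRF's actual code: `build_search_params`, `pages_needed` and the `days_back` parameter are hypothetical names and do not describe the signature of find_repo or grf.

```python
import math
from datetime import date, timedelta

SEARCH_URL = "https://api.github.com/search/repositories"
MAX_SEARCH_RESULTS = 1000  # the search API returns at most 1,000 results per query
PER_PAGE = 100             # largest page size the API accepts


def build_search_params(keyword, days_back, today=None, page=1):
    """Build query parameters for repositories created in the last `days_back` days."""
    today = today or date.today()
    start = today - timedelta(days=days_back)
    # `created:START..END` is GitHub's date-range search qualifier.
    query = f"{keyword} created:{start.isoformat()}..{today.isoformat()}"
    return {"q": query, "per_page": PER_PAGE, "page": page}


def pages_needed(total_count):
    """Number of result pages to request, capped by the API's result limit."""
    reachable = min(total_count, MAX_SEARCH_RESULTS)
    return math.ceil(reachable / PER_PAGE)
```

The returned dictionary can be passed as `params` to `requests.get(SEARCH_URL, ...)` together with an authorization header. Because of the result cap, a broad keyword may need to be split into several shorter date ranges to be covered completely.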

Sample dataset:

A sample dataset was obtained by using the following command:

grf("python", 3, 8)
import pandas as pd
pd.read_csv("data/dt.csv", delimiter=";", nrows=10)
| | id | name | url | language | created | stars | watch | forks | readme |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 416977797 | AI_Project | https://github.com/pbl4team/AI_Project | Python | 2021-10-14T03:37:01Z | 0 | 0 | 4 | Project AI Systeam - Computer Vision with pyth... |
| 1 | 416995331 | python | https://github.com/Cam0411/python | Python | 2021-10-14T05:06:00Z | 0 | 0 | 0 | python |
| 2 | 416908634 | Python | https://github.com/psplendid61/Python | NaN | 2021-10-13T21:53:43Z | 0 | 0 | 0 | NaN |
| 3 | 416963376 | python | https://github.com/iAMSe/python | NaN | 2021-10-14T02:28:55Z | 0 | 0 | 0 | python |
| 4 | 416996896 | python | https://github.com/rakeshk67/python | NaN | 2021-10-14T05:13:42Z | 0 | 0 | 0 | python |
| 5 | 416961346 | python | https://github.com/colddie/python | NaN | 2021-10-14T02:19:55Z | 0 | 0 | 0 | NaN |
| 6 | 416990435 | Python | https://github.com/mahdidahmani/Python | Python | 2021-10-14T04:41:11Z | 0 | 0 | 0 | NaN |
| 7 | 416952467 | python | https://github.com/grace-th3/python | Python | 2021-10-14T01:38:42Z | 0 | 0 | 0 | NaN |
| 8 | 416928589 | Python | https://github.com/Cheung-man/Python | NaN | 2021-10-13T23:36:27Z | 0 | 0 | 0 | NaN |
| 9 | 416935834 | python | https://github.com/mygithuang/python | NaN | 2021-10-14T00:15:25Z | 0 | 0 | 0 | python mygithuang |
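The `delimiter=";"` argument in the command above indicates that the dataset is stored as a semicolon-separated file. The sketch below uses only the standard-library `csv` module to write and read a file in that layout, with the column names from the sample. The helpers `write_rows` and `read_rows` are hypothetical and are not GRF's save_dt implementation; the two rows are taken from the sample output.

```python
import csv
import io

COLUMNS = ["id", "name", "url", "language", "created",
           "stars", "watch", "forks", "readme"]


def write_rows(rows, handle):
    """Write repository records as semicolon-separated text with a header row."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter=";")
    writer.writeheader()
    writer.writerows(rows)  # None values are written as empty fields


def read_rows(handle):
    """Read the records back as a list of dictionaries (all values are strings)."""
    return list(csv.DictReader(handle, delimiter=";"))


sample = [
    {"id": 416995331, "name": "python",
     "url": "https://github.com/Cam0411/python", "language": "Python",
     "created": "2021-10-14T05:06:00Z", "stars": 0, "watch": 0, "forks": 0,
     "readme": "python"},
    {"id": 416908634, "name": "Python",
     "url": "https://github.com/psplendid61/Python", "language": None,
     "created": "2021-10-13T21:53:43Z", "stars": 0, "watch": 0, "forks": 0,
     "readme": None},
]

buffer = io.StringIO()  # stands in for a file such as data/dt.csv
write_rows(sample, buffer)
buffer.seek(0)
rows = read_rows(buffer)
```

Fields that are empty in the file, such as a missing language or readme, are the ones pandas displays as NaN when the same file is loaded with `pd.read_csv`.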

Function source code:

Source code and detailed function documentation are available at: GRF Source code
