COSCUP 2024

Traditional Chinese AI Open Source Practice Project: Sharing the Results
2024-08-03, 13:00–16:00 (Asia/Taipei), TR611

Collaborative event notes: https://g0v.hackmd.io/@jothon/AI_Grant_20240803

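The Traditional Chinese AI Open Source Practice Project aims to foster Gen AI projects with a high degree of transparency, reusability, and sustainability (long-term impact). Because AI models change rapidly, the project directs its resources toward collecting and curating high-quality open datasets, so that the results can contribute to building Traditional Chinese models both now and in the future. The project focuses on building Traditional Chinese text data for language model training and on benchmarks that evaluate Taiwanese perspectives, and it encourages teams working on languages such as Taiwanese to take part. In this session, the teams collaborating with the project will share their open-source results and implementation experience. The results will be released progressively on the Hugging Face platform. The six teams are: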

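➀ Taiwan Chatbot Arena, an arena for Taiwan language models
➁ LegaL-Mind: an intelligent legal consultation system
➂ A health promotion assistant that has read a large body of Taiwanese research
➃ Building a regularly updated dashboard for observing legislators' statements and a dataset of political current affairs
➄ Taiwan AI Teaching Co-creation Lab
➅ An automatic word segmentation and part-of-speech tagging system for Taiwanese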

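Introductions to the Traditional Chinese AI Open Source Practice Project and its partner teams, with links to the open-source results: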
https://g0v.hackmd.io/@jothon/AI_Grant_20240803


Traditional Chinese AI Open Source Practice Project, organizing parties:
.Contact us: jothon-organizers@g0v.tw
.[Organizers] g0v Jothon (https://jothon.g0v.tw/about/), Sch001 (https://sch001.g0v.tw/)
.[Sponsor] Brighter Capital (https://brightercapital.com/)
.[Co-organizers] Frontier Foundation (財團法人開拓文教基金會, https://www.frontier.org.tw/blog2/), Open Culture Foundation (財團法人開放文化基金會, https://ocf.tw/), Taiwan National Treasure Foundation (https://www.nationaltreasure.tw/en)

https://jothon.g0v.tw/

The g0v Jothon is the g0v community's task force for organizing the bimonthly hackathons and the infrathons and for handling fundraising. The team currently consists of seven volunteers and two staff members. Jothon began helping the community organize hackathons in 2012 and was formally established under the name "Jothon" in 2014. In 2016, Jothon initiated the Community Infrastructure Project and launched a series of "Infrathons" to promote smoother online/offline cross-disciplinary collaboration alongside the regular hackathons. At the end of the same year, Jothon introduced the "g0v Civic Tech Prototype Grant" to encourage continued development and long-term maintenance and operation of g0v projects. In 2020, Jothon launched "Sch001" to rethink the role of schools from scratch together with the education and open-source communities. In 2024, Jothon carried out the "Traditional Chinese AI Open Source Practice Project" to encourage civil teams to work on localized language models.

Project overview: https://sch001.g0v.tw/dash/prj/Psgw1_h15KNJoFo55nCCo4GTTi_Q7C
Hugging Face: https://huggingface.co/datasets/aigrant/tw_chatbot_arena

Project overview: https://sch001.g0v.tw/dash/prj/PscU0Ax3sXd6bCUw57AB6Tybr4BlnR
Hugging Face: https://huggingface.co/datasets/aigrant/Legal-Mind-Mix-160K

Project overview: https://sch001.g0v.tw/dash/prj/PqYu6bC3rc.Ii6Qc5h99T3JtbtQn2o
Hugging Face: https://huggingface.co/datasets/aigrant/medical_health

Project overview: https://sch001.g0v.tw/dash/prj/PuH4T8g4v2yywCP85Wc9MluRFz_HCh
Hugging Face: https://huggingface.co/datasets/aigrant/taiwan-legislator-transcript
Hugging Face: https://huggingface.co/datasets/aigrant/taiwan-ly-law-research

Project overview: https://sch001.g0v.tw/dash/prj/PwDWHhZ3DFGZfDP55_uBm3R_T3ypcr
Hugging Face: https://huggingface.co/datasets/gatelynch/awesome-taiwan-knowledge

Project overview: https://sch001.g0v.tw/dash/prj/PwBWl.O3AIxboDff5pXCq.DBAx1Eza
Hugging Face: https://huggingface.co/datasets/aigrant/Taiwanese-Chinese_characters-POJ-Collection