Showing 1–2 of 2 results for author: Jaghouar, S

Search v0.5.6 released 2020-02-24

arXiv:2407.07852 [pdf, other]

cs.LG cs.DC

OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training

Authors: Sami Jaghouar, Jack Min Ong, Johannes Hagemann

Abstract: OpenDiLoCo is an open-source implementation and replication of the Distributed Low-Communication (DiLoCo) training method for large language models. We provide a reproducible implementation of the DiLoCo experiments, offering it within a scalable, decentralized training framework using the Hivemind library. We demonstrate its effectiveness by training a model across two continents and three countries, while maintaining 90-95% compute utilization. Additionally, we conduct ablation studies focusing on the algorithm's compute efficiency and its scalability in the number of workers, and show that its gradients can be all-reduced using FP16 without any performance degradation. Furthermore, we scale OpenDiLoCo to 3x the size of the original work, demonstrating its effectiveness for billion-parameter models.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2111.14426 [pdf, other]

cs.LG cs.CV

doi 10.1007/978-3-031-16788-1_36

Improving traffic sign recognition by active search

Authors: S. Jaghouar, H. Gustafsson, B. Mehlig, E. Werner, N. Gustafsson

Abstract: We describe an iterative active-learning algorithm to recognise rare traffic signs. A standard ResNet is trained on a training set containing only a single sample of the rare class. We demonstrate that by sorting the samples of a large, unlabeled set by the estimated probability of belonging to the rare class, we can efficiently identify samples from the rare class. This works despite the fact that this estimated probability is usually quite low. A reliable active-learning loop is obtained by labeling these candidate samples, including them in the training set, and iterating the procedure. Further, we show that we get similar results starting from a single synthetic sample. Our results are important as they indicate a straightforward way of improving traffic-sign recognition for automated driving systems. In addition, they show that we can make use of the information hidden in low confidence outputs, which is usually ignored.

Submitted 29 November, 2021; originally announced November 2021.

Comments: 6 pages, 7 figures

Journal ref: DAGM GCPR 2022 Pattern Recognition pp. 594–606 (2022)
