Influence of batch size of training convergence · Issue #25 · openai/CLIP

github.com/openai/CLIP/issues/25

Top Highlights

  • The pairwise combinations of all image-text feature interactions are then cheap to compute on top of this, because they can re-use these already computed embeddings and only involve calculating a single additional inner product of relatively low-dimensional vectors per pair.

  • We had to modify BCE (manual downweighting to prevent negative pairs from dominating the loss) or "warm up" from a small amount of negatives.

  • Other common alternatives in the literature like triplet loss,
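
The first highlight above describes why the full grid of image-text similarities is cheap once the embeddings exist. A minimal sketch of that idea (random tensors stand in for the encoder outputs, and CLIP's learnable logit scale is omitted for brevity):

```python
import torch
import torch.nn.functional as F

batch_size, embed_dim = 8, 512                         # illustrative sizes
image_features = torch.randn(batch_size, embed_dim)    # stand-in for image encoder output
text_features = torch.randn(batch_size, embed_dim)     # stand-in for text encoder output

# Normalize so each inner product is a cosine similarity.
image_features = F.normalize(image_features, dim=-1)
text_features = F.normalize(text_features, dim=-1)

# All pairwise image-text similarities: one (batch_size x batch_size) matrix multiply
# that re-uses the already computed embeddings, i.e. one inner product per pair.
# (CLIP also multiplies by a learnable logit scale / temperature, omitted here.)
logits = image_features @ text_features.t()

# Symmetric cross-entropy: the matching pair is the "correct class" for each image and each text.
labels = torch.arange(batch_size)
loss = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2
```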
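
The second highlight notes that plain BCE had to be modified, since an N x N grid of pairs has N positives and N*(N-1) negatives, so negatives dominate the loss. The sketch below is only an illustration of manual downweighting; the function name and the 1/(N-1) weighting are assumptions for the example, not the authors' exact scheme.

```python
import torch
import torch.nn.functional as F

def weighted_bce_contrastive(logits, neg_weight=None):
    """logits: (N, N) image-text similarity scores; diagonal entries are the positive pairs."""
    n = logits.size(0)
    targets = torch.eye(n, device=logits.device)       # 1 on the diagonal, 0 elsewhere
    if neg_weight is None:
        neg_weight = 1.0 / (n - 1)                      # balance total positive vs. negative mass
    weights = torch.where(targets.bool(),
                          torch.ones_like(logits),
                          torch.full_like(logits, neg_weight))
    return F.binary_cross_entropy_with_logits(logits, targets, weight=weights)

# Usage with the logits from the sketch above:
# loss = weighted_bce_contrastive(logits)
```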
