huggingface · burtenshaw · Oct 1, 2025
diff --git a/chapters/bn/chapter1/audio_data.mdx b/chapters/bn/chapter1/audio_data.mdx
@@ -167,9 +167,8 @@ DFT এর আউটপুট হল জটিল সংখ্যার এক
 আপনি amplitude মানগুলিকে ডেসিবেল স্কেলে রূপান্তর করতে `librosa.amplitude_to_db()` ব্যবহার করতে পারেন, এটি বর্ণালী মধ্যে সূক্ষ্ম বিবরণ টিকে দেখতে সহজ করে তোলে। কখনও কখনও লোকেরা **power বর্ণালী** ব্যবহার করে, যা amplitude এর পরিবর্তে শক্তি পরিমাপ করে;
 যা কেবলমাত্র amplitude এর মান এর বর্গাকার।
 
-<Tip>
-💡 বাস্তবে, লোকেরা FFT শব্দটি DFT-এর সাথে বিনিময়যোগ্যভাবে ব্যবহার করে, কারণ FFT বা Fast Fourier Transform হলো কম্পিউটারে DFT গণনা করার একমাত্র কার্যকরী  উপায়।
-</Tip>
+> [!TIP]
+> 💡 বাস্তবে, লোকেরা FFT শব্দটি DFT-এর সাথে বিনিময়যোগ্যভাবে ব্যবহার করে, কারণ FFT বা Fast Fourier Transform হলো কম্পিউটারে DFT গণনা করার একমাত্র কার্যকরী  উপায়।
 
 একটি অডিও সিগন্যালের ফ্রিকোয়েন্সি বর্ণালীতে তার তরঙ্গরূপের মতো একই তথ্য থাকে - তারা কেবল দুটি ভিন্ন উপায়, একই ডেটাকে দেখার। যেখানে তরঙ্গরূপ amplitude প্লট করে
 সময়ের সাথে সাথে অডিও সিগন্যালের এবং বর্ণালী নির্দিষ্ট সময়ে পৃথক ফ্রিকোয়েন্সির amplitude কল্পনা করে।
@@ -252,12 +251,11 @@ plt.colorbar()
 কারণ ডেসিবেলে রূপান্তর করার জন্যে লগারিদমিক অপারেশন প্রয়োগ করতে হয়। উপরের উদাহরণটি `librosa.power_to_db()` ব্যবহার করে `librosa.feature.melspectrogram()`
 একটি power spectrogram তৈরি করে।
 
-<Tip>
-💡 সব mel spectrogram একই নয়! সাধারণ ব্যবহারে দুটি ভিন্ন মেল স্কেল আছে ("htk" এবং "slaney"),
-এবং power spectrogram এর পরিবর্তে amplitude spectrogram ব্যবহার করা যেতে পারে। log-mel spectrogram গণনা করার সময় সর্বদা সত্য ডেসিবেল
-গণনা করা হয় না, কিন্তু সহজভাবে `লগ` নেওয়া হতে পারে। অতএব, যদি একটি মেশিন লার্নিং মডেল ইনপুট হিসাবে একটি mel spectrogram আশা করে,
-আপনি একই ভাবে কম্পিউট করছেন তা নিশ্চিত করতে দুবার চেক করুন।
-</Tip>
+> [!TIP]
+> 💡 সব mel spectrogram একই নয়! সাধারণ ব্যবহারে দুটি ভিন্ন মেল স্কেল আছে ("htk" এবং "slaney"),
+> এবং power spectrogram এর পরিবর্তে amplitude spectrogram ব্যবহার করা যেতে পারে। log-mel spectrogram গণনা করার সময় সর্বদা সত্য ডেসিবেল
+> গণনা করা হয় না, কিন্তু সহজভাবে `লগ` নেওয়া হতে পারে। অতএব, যদি একটি মেশিন লার্নিং মডেল ইনপুট হিসাবে একটি mel spectrogram আশা করে,
+> আপনি একই ভাবে কম্পিউট করছেন তা নিশ্চিত করতে দুবার চেক করুন।
 
 একটি mel spectrogram তৈরি করা একটি ক্ষতিকর অপারেশন কারণ এতে সিগন্যাল ফিল্টার করা জড়িত। একটি mel spectrogram কে নিয়মিত তরঙ্গরূপ এ রূপান্তর করা
 খুবই কঠিন, এমনকি সাহারণ spectrogram কে নিয়মিত তরঙ্গরূপ এ রূপান্তর করা এর চেয়ে সহজ, কারণ এর জন্য যেই ফ্রিকোয়েন্সিগুলিকে ফেলে দেওয়া হয়েছিল সেগুলোকে

diff --git a/chapters/bn/chapter1/preprocessing.mdx b/chapters/bn/chapter1/preprocessing.mdx
@@ -60,14 +60,13 @@ minds[0]
 আপনি লক্ষ্য করতে পারেন যে অ্যারের মানগুলিও এখন ভিন্ন। এর কারণ হল আমরা এখন এর জন্য amplitude মানগুলির দ্বিগুণ সংখ্যা পেয়েছি
 প্রতিটি যে আমরা আগে ছিল.
 
-<Tip>
-💡 Resampling সম্পর্কে কিছু তথ্য: যদি একটি অডিও সিগন্যাল ৮ kHz এ নমুনা নেওয়া হয়, যাতে প্রতি সেকেন্ডে ৮০০০ নমুনা রিডিং হয়, আমরা জানি যে অডিওতে
-৪ kHz এর বেশি ফ্রিকোয়েন্সি নেই, এটি Nyquist sampling theorem দ্বারা নিশ্চিত করা হয়। এই কারণে, আমরা নিশ্চিত হতে পারি যে স্যাম্পলিং পয়েন্টগুলির মধ্যে
-সর্বদা মূল অবিচ্ছিন্ন সংকেত থাকে যা একটি মসৃণ বক্ররেখা তৈরি করে। upsampling করার মানে তখন, অতিরিক্ত নমুনার মান এই বক্ররেখা আনুমান করে গণনা করা।
-এই মানগুলি আগে থেকে উপস্থিত যমুনার মান এর সাহায্যে গণনা করা হয়।  downsampling এর জন্যে আমরা প্রথমে যেকোনো ফ্রিকোয়েন্সি ফিল্টার আউট করি যা
-Nyquist সীমার চেয়ে বেশি, তারপর নতুন নমুনা গণনা করি। অন্য কথায়, আপনি প্রতি দ্বিতীয় নমুনাকে ছুঁড়ে ফেলার মাধ্যমে ২x ফ্যাক্টর এ downsample করতে
-পারবেন না - এটি সিগন্যালে বিকৃতি তৈরি করবে। resampling সঠিকভাবে করা কঠিন এবং ভাল-পরীক্ষিত লাইব্রেরি যেমন librosa বা 🤗 datasets এর উপর ছেড়ে দেওয়াই ভালো।
-</Tip>
+> [!TIP]
+> 💡 Resampling সম্পর্কে কিছু তথ্য: যদি একটি অডিও সিগন্যাল ৮ kHz এ নমুনা নেওয়া হয়, যাতে প্রতি সেকেন্ডে ৮০০০ নমুনা রিডিং হয়, আমরা জানি যে অডিওতে
+> ৪ kHz এর বেশি ফ্রিকোয়েন্সি নেই, এটি Nyquist sampling theorem দ্বারা নিশ্চিত করা হয়। এই কারণে, আমরা নিশ্চিত হতে পারি যে স্যাম্পলিং পয়েন্টগুলির মধ্যে
+> সর্বদা মূল অবিচ্ছিন্ন সংকেত থাকে যা একটি মসৃণ বক্ররেখা তৈরি করে। upsampling করার মানে তখন, অতিরিক্ত নমুনার মান এই বক্ররেখা অনুমান করে গণনা করা।
+> এই মানগুলি আগে থেকে উপস্থিত নমুনার মান এর সাহায্যে গণনা করা হয়।  downsampling এর জন্যে আমরা প্রথমে যেকোনো ফ্রিকোয়েন্সি ফিল্টার আউট করি যা
+> Nyquist সীমার চেয়ে বেশি, তারপর নতুন নমুনা গণনা করি। অন্য কথায়, আপনি প্রতি দ্বিতীয় নমুনাকে ছুঁড়ে ফেলার মাধ্যমে ২x ফ্যাক্টর এ downsample করতে
+> পারবেন না - এটি সিগন্যালে বিকৃতি তৈরি করবে। resampling সঠিকভাবে করা কঠিন এবং ভাল-পরীক্ষিত লাইব্রেরি যেমন librosa বা 🤗 datasets এর উপর ছেড়ে দেওয়াই ভালো।
 
 ## ডেটাসেট ফিল্টার করা
 

diff --git a/chapters/en/chapter0/get_ready.mdx b/chapters/en/chapter0/get_ready.mdx
@@ -27,16 +27,13 @@ To go through the course materials you will need:
 - A computer with an internet connection
 - [Google Colab](https://colab.research.google.com) for hands-on exercises. The free version is enough. If you have never used Google Colab before, check out this [official introduction notebook](https://colab.research.google.com/notebooks/intro.ipynb).
 
-<Tip>
-
-As an alternative to the free tier of Google Colab, you can use your own local setup, or Kaggle Notebooks. Kaggle Notebooks 
-offer a fixed number of GPU hours and have similar functionality to Google Colab, however, there are differences when it 
-comes to sharing your models on 🤗 Hub (e.g. for completing assignments). If you decide to use Kaggle Notebooks as your 
-tool of choice, check out the [example Kaggle notebook](https://www.kaggle.com/code/michaelshekasta/test-notebook) created by 
-[@michaelshekasta](https://github.com/michaelshekasta). This notebook illustrates how you can train and share your 
-trained model on 🤗 Hub.
-
-</Tip>
+> [!TIP]
+> As an alternative to the free tier of Google Colab, you can use your own local setup, or Kaggle Notebooks. Kaggle Notebooks 
+> offer a fixed number of GPU hours and have similar functionality to Google Colab; however, there are differences when it 
+> comes to sharing your models on 🤗 Hub (e.g. for completing assignments). If you decide to use Kaggle Notebooks as your 
+> tool of choice, check out the [example Kaggle notebook](https://www.kaggle.com/code/michaelshekasta/test-notebook) created by 
+> [@michaelshekasta](https://github.com/michaelshekasta). This notebook illustrates how you can train and share your 
+> trained model on 🤗 Hub.
 
 ## Step 5. Join the community
 

diff --git a/chapters/en/chapter1/audio_data.mdx b/chapters/en/chapter1/audio_data.mdx
@@ -178,10 +178,9 @@ You used `librosa.amplitude_to_db()` to convert the amplitude values to the deci
 the finer details in the spectrum. Sometimes people use the **power spectrum**, which measures energy rather than amplitude;
 this is simply a spectrum with the amplitude values squared.
 
-<Tip>
-💡 In practice, people use the term FFT interchangeably with DFT, as the FFT or Fast Fourier Transform is the only efficient
-way to calculate the DFT on a computer.
-</Tip>
+> [!TIP]
+> 💡 In practice, people use the term FFT interchangeably with DFT, as the FFT or Fast Fourier Transform is the only efficient
+> way to calculate the DFT on a computer.
 
 The frequency spectrum of an audio signal contains the exact same information as its waveform — they are simply two different
 ways of looking at the same data (here, the first 4096 samples from the trumpet sound). Where the waveform plots the amplitude
@@ -276,12 +275,11 @@ Just as with a regular spectrogram, it's common practice to express the strength
 decibels. This is commonly referred to as a **log-mel spectrogram**, because the conversion to decibels involves a
 logarithmic operation. The above example used `librosa.power_to_db()` as `librosa.feature.melspectrogram()` creates a power spectrogram.
 
-<Tip>
-💡 Not all mel spectrograms are the same! There are two different mel scales in common use ("htk" and "slaney"),
-and instead of the power spectrogram the amplitude spectrogram may be used. The conversion to a log-mel spectrogram doesn't
-always compute true decibels but may simply take the `log`. Therefore, if a machine learning model expects a mel spectrogram
-as input, double check to make sure you're computing it the same way.
-</Tip>
+> [!TIP]
+> 💡 Not all mel spectrograms are the same! There are two different mel scales in common use ("htk" and "slaney"),
+> and instead of the power spectrogram the amplitude spectrogram may be used. The conversion to a log-mel spectrogram doesn't
+> always compute true decibels but may simply take the `log`. Therefore, if a machine learning model expects a mel spectrogram
+> as input, double-check to make sure you're computing it the same way.
 
 Creating a mel spectrogram is a lossy operation as it involves filtering the signal. Converting a mel spectrogram back
 into a waveform is more difficult than doing this for a regular spectrogram, as it requires estimating the frequencies

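Reviewer note on the log-mel tip above: the differences it lists can be made concrete with a small standard-library sketch. The helper names below (`amp_to_db`, `pow_to_db`, `hz_to_mel_htk`, `hz_to_mel_slaney`) are hypothetical and are not librosa's API. They leave out the reference clipping and `top_db` handling that `librosa.amplitude_to_db()` and `librosa.power_to_db()` perform, and the mel formulas are the commonly cited "htk" and "slaney" definitions, assumed here rather than taken from this PR.

```python
import math


def amp_to_db(amplitude, ref=1.0):
    # Decibels from an amplitude value: 20 * log10(A / ref).
    return 20.0 * math.log10(amplitude / ref)


def pow_to_db(power, ref=1.0):
    # Decibels from a power value (amplitude squared): 10 * log10(P / ref).
    return 10.0 * math.log10(power / ref)


def hz_to_mel_htk(hz):
    # "htk" mel scale: logarithmic over the whole frequency range.
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def hz_to_mel_slaney(hz):
    # "slaney" mel scale: linear below 1 kHz, logarithmic above it.
    f_sp = 200.0 / 3.0               # Hz per mel in the linear region
    min_log_hz = 1000.0              # start of the logarithmic region
    min_log_mel = min_log_hz / f_sp  # mel value at 1 kHz (15)
    logstep = math.log(6.4) / 27.0   # step size in the logarithmic region
    if hz < min_log_hz:
        return hz / f_sp
    return min_log_mel + math.log(hz / min_log_hz) / logstep


amplitude = 10.0
power = amplitude ** 2

db_from_amplitude = amp_to_db(amplitude)  # same value as db_from_power
db_from_power = pow_to_db(power)
plain_log_of_power = math.log(power)      # natural log: a different number

mel_htk_1k = hz_to_mel_htk(1000.0)
mel_slaney_1k = hz_to_mel_slaney(1000.0)
```

The two decibel conversions agree with each other, but the plain `log` of the same power value does not, and the two mel scales assign very different mel values to 1 kHz. This is why a model trained on one convention should be fed features computed with that same convention.
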
diff --git a/chapters/en/chapter1/preprocessing.mdx b/chapters/en/chapter1/preprocessing.mdx
@@ -61,16 +61,15 @@ minds[0]
 You may notice that the array values are now also different. This is because we've now got twice the number of amplitude values for
 every one that we had before.
 
-<Tip>
-💡 Some background on resampling: If an audio signal has been sampled at 8 kHz, so that it has 8000 sample readings per
-second, we know that the audio does not contain any frequencies over 4 kHz. This is guaranteed by the Nyquist sampling
-theorem. Because of this, we can be certain that in between the sampling points the original continuous signal always
-makes a smooth curve. Upsampling to a higher sampling rate is then a matter of calculating additional sample values that go in between
-the existing ones, by approximating this curve. Downsampling, however, requires that we first filter out any frequencies
-that would be higher than the new Nyquist limit, before estimating the new sample points. In other words, you can't
-downsample by a factor 2x by simply throwing away every other sample — this will create distortions in the signal called
-aliases. Doing resampling correctly is tricky and best left to well-tested libraries such as librosa or 🤗 Datasets.
-</Tip>
+> [!TIP]
+> 💡 Some background on resampling: If an audio signal has been sampled at 8 kHz, so that it has 8000 sample readings per
+> second, we know that the audio does not contain any frequencies over 4 kHz. This is guaranteed by the Nyquist sampling
+> theorem. Because of this, we can be certain that in between the sampling points the original continuous signal always
+> makes a smooth curve. Upsampling to a higher sampling rate is then a matter of calculating additional sample values that go in between
+> the existing ones, by approximating this curve. Downsampling, however, requires that we first filter out any frequencies
+> that would be higher than the new Nyquist limit, before estimating the new sample points. In other words, you can't
+> downsample by a factor of 2 by simply throwing away every other sample — this will create distortions in the signal called
+> aliases. Doing resampling correctly is tricky and best left to well-tested libraries such as librosa or 🤗 Datasets.
 
 ## Filtering the dataset
 

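Reviewer note on the resampling tip above: the aliasing it warns about can be reproduced with a short standard-library sketch. The tone frequency, signal length, and helper names (`dft_magnitudes`, `peak_frequency`) are illustrative assumptions, and the naive O(N²) DFT is used only to keep the example dependency-free. It is not how librosa or 🤗 Datasets resample audio.

```python
import cmath
import math


def dft_magnitudes(samples):
    # Naive DFT magnitudes for bins 0 .. N/2 (fine for short signals only).
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def peak_frequency(samples, sampling_rate):
    # Frequency in Hz of the strongest DFT bin.
    mags = dft_magnitudes(samples)
    peak_bin = max(range(len(mags)), key=mags.__getitem__)
    return peak_bin * sampling_rate / len(samples)


SAMPLING_RATE = 8000  # Hz, so the Nyquist limit is 4000 Hz
NUM_SAMPLES = 800     # 0.1 s of audio, 10 Hz per DFT bin
TONE_HZ = 3000        # below 4000 Hz, but above the new 2000 Hz limit

signal = [
    math.sin(2 * math.pi * TONE_HZ * t / SAMPLING_RATE) for t in range(NUM_SAMPLES)
]

# Naive 2x downsampling: keep every other sample, with no low-pass filter first.
decimated = signal[::2]

peak_before = peak_frequency(signal, SAMPLING_RATE)
peak_after = peak_frequency(decimated, SAMPLING_RATE // 2)

print(f"peak before: {peak_before:.0f} Hz, peak after naive decimation: {peak_after:.0f} Hz")
```

After dropping every other sample, the 3 kHz tone shows up at 1 kHz, a frequency that was never in the original audio. A proper resampler would first filter out content above the new 2 kHz Nyquist limit, which in this case removes the tone instead of folding it back into the audible band.
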
diff --git a/chapters/en/chapter3/classification.mdx b/chapters/en/chapter3/classification.mdx
@@ -18,9 +18,8 @@ Just like ViT, the AST model splits the audio spectrogram into a sequence of par
 
 Image from the paper [AST: Audio Spectrogram Transformer](https://arxiv.org/pdf/2104.01778.pdf)
 
-<Tip>
-💡 Even though here we pretend spectrograms are the same as images, there are important differences. For example, shifting the contents of an image up or down generally does not change the meaning of what is in the image. However, shifting a spectrogram up or down will change the frequencies that are in the sound and completely change its character. Images are invariant under translation but spectrograms are not. Treating spectrograms as images can work very well in practice, but keep in mind they are not really the same thing.
-</Tip>
+> [!TIP]
+> 💡 Even though here we pretend spectrograms are the same as images, there are important differences. For example, shifting the contents of an image up or down generally does not change the meaning of what is in the image. However, shifting a spectrogram up or down will change the frequencies that are in the sound and completely change its character. Images are invariant under translation but spectrograms are not. Treating spectrograms as images can work very well in practice, but keep in mind they are not really the same thing.
 
 ## Any transformer can be a classifier