Special tokens are called special because they are not derived from your input. They are added for a certain purpose and are independent of the specific input.
What I don't understand is: in what circumstances would you want
to create a new special token? Are there any examples of what we need
one for, and when we would want to create a special token other than
the default special tokens?
Just as an example: in extractive conversational question answering, it is not unusual to add the question and answer of the previous dialog turn to your input to provide some context for your model. These previous dialog turns are separated from the current question with special tokens. Sometimes people reuse the model's separator token for this, and sometimes they introduce new special tokens. The following is an example with a new special token [Q]:
#first dialog turn - no conversation history
[CLS] current question [SEP] text [EOS]
#second dialog turn - with previous question to have some context
[CLS] previous question [Q] current question [SEP] text [EOS]
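If you introduce a new special token like [Q], both the tokenizer and the model have to know about it: you register it as an additional special token (so it is never split into sub-word pieces) and resize the model's embedding matrix so the new token gets its own vector, which is then learned during fine-tuning. A rough sketch with transformers ([Q] is just the illustrative name from above):
from transformers import RobertaTokenizer, RobertaModel
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")
# register [Q] so the tokenizer keeps it as a single, never-split token
tokenizer.add_special_tokens({"additional_special_tokens": ["[Q]"]})
# grow the embedding matrix so the new token gets its own (randomly initialized) vector
model.resize_token_embeddings(len(tokenizer))
tokenizer.tokenize("previous question [Q] current question")
# '[Q]' shows up as a single token in the output instead of being split into pieces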
And I also don't quite understand the following description in the
source documentation. What difference does it make to our model if we
set add_special_tokens to False?
from transformers import RobertaTokenizer
t = RobertaTokenizer.from_pretrained("roberta-base")
t("this is an example")
#{'input_ids': [0, 9226, 16, 41, 1246, 2], 'attention_mask': [1, 1, 1, 1, 1, 1]}
t("this is an example", add_special_tokens=False)
#{'input_ids': [9226, 16, 41, 1246], 'attention_mask': [1, 1, 1, 1]}
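If you map the ids back to tokens (a quick check, reusing the tokenizer t from above), you can see exactly what gets dropped:
t.convert_ids_to_tokens(t("this is an example")["input_ids"])
#['<s>', 'this', 'Ġis', 'Ġan', 'Ġexample', '</s>']
t.convert_ids_to_tokens(t("this is an example", add_special_tokens=False)["input_ids"])
#['this', 'Ġis', 'Ġan', 'Ġexample']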
As you can see, the input without add_special_tokens is missing the two special tokens. Those special tokens have a meaning for your model since it was trained with them. The last_hidden_state will be different due to the lack of those two tokens and will therefore lead to a different result for your downstream task.
Some tasks, like sequence classification, often use the [CLS] token (RoBERTa's <s>) to make their predictions. When you remove it, a model that was pre-trained with a [CLS] token will struggle.
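To make this concrete, here is a minimal sketch (using the plain roberta-base encoder, so the comparison only illustrates that the outputs change; the variable names and prints are my own): the hidden states of the remaining word tokens change when <s>/</s> are missing, and a classification head such as the one in RobertaForSequenceClassification makes its prediction from exactly the first position, i.e. the slot where <s>/[CLS] is expected to sit.
import torch
from transformers import RobertaModel, RobertaTokenizer

t = RobertaTokenizer.from_pretrained("roberta-base")
m = RobertaModel.from_pretrained("roberta-base")
m.eval()

enc_with = t("this is an example", return_tensors="pt")
enc_without = t("this is an example", add_special_tokens=False, return_tensors="pt")

with torch.no_grad():
    h_with = m(**enc_with).last_hidden_state        # shape (1, 6, 768)
    h_without = m(**enc_without).last_hidden_state  # shape (1, 4, 768)

# the four word tokens get different representations in the two runs, because
# self-attention can no longer attend to the (missing) <s>/</s> positions
print(torch.allclose(h_with[0, 1:5], h_without[0]))
#False

# a sequence-classification head predicts from the first position, i.e. the
# <s>/[CLS] slot; without special tokens it would instead be fed the vector
# of an ordinary word token
cls_vector = h_with[:, 0]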