LB-113: Update robots.txt for Scrobble API endpoint #106
@@ -2,3 +2,5 @@ User-agent: *
Disallow: /current-status
Disallow: /user/
Disallow: /1/
Disallow: /api
Disallow: /*lastfm
Why is this line needed?
You also missed the URL for
@pinkeshbadjatiya Made the changes.
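For context on the `Disallow: /*lastfm` line questioned above, here is a rough sketch of how a crawler that supports the `*` and `$` pattern extensions might evaluate such a rule. It is a hand-rolled matcher written for illustration, not the code of any particular crawler, and wildcard support varies between parsers. The function names and the sample paths (such as `/user/alice/lastfm`) are made up for the example.

```python
import re


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters and a trailing
    '$' anchors the pattern to the end of the URL path. Everything else
    is matched literally, as a prefix.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(rule: str, path: str) -> bool:
    """Return True if the URL path falls under the given rule."""
    # re.match anchors at the start only, which gives prefix semantics.
    return rule_to_regex(rule).match(path) is not None


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the wildcard rule.
    for path in ("/user/alice/lastfm", "/lastfm-proxy", "/about"):
        print(path, matches("/*lastfm", path))
```

Under these assumed semantics, `/*lastfm` covers any path that contains `lastfm` after the leading slash, which is much broader than a plain prefix rule.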
@@ -2,3 +2,5 @@ User-agent: *
Disallow: /current-status
Disallow: /user/
Disallow: /1/
Disallow: /api
Is there a reason for using /api here and not /api/? (I'm a bit rusty on my robots.txt syntax.)
There's a file, /api.py, that needs to be blocked for bots. '/api' blocks every file whose path starts with '/api', whereas '/api/' blocks only the /api/ folder (which is not present in LB), not a file :)
You are blocking URLs, not files or folders. It is better to have a trailing slash everywhere by default unless you are intentionally blocking prefixes like /apistuff.
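To make the prefix point concrete, here is a small sketch using Python's standard-library `urllib.robotparser`. It compares `Disallow: /api` with `Disallow: /api/` against a handful of paths. The helper name `allowed` and the sample paths (`/apistuff`, `/api/v1`) are illustrative only and are not taken from the ListenBrainz code base.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules: str, path: str) -> bool:
    """Return True if a generic crawler may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch("*", path)


# Two candidate rule sets that differ only in the trailing slash.
NO_SLASH = "User-agent: *\nDisallow: /api\n"
WITH_SLASH = "User-agent: *\nDisallow: /api/\n"

if __name__ == "__main__":
    # Each rule is compared against the start of the URL path.
    for path in ("/api", "/api/", "/api.py", "/apistuff", "/api/v1"):
        print(
            f"{path:10} "
            f"/api -> {allowed(NO_SLASH, path)!s:5} "
            f"/api/ -> {allowed(WITH_SLASH, path)}"
        )
```

With `Disallow: /api`, every listed path is blocked, including `/apistuff`. With `Disallow: /api/`, only `/api/` and `/api/v1` are blocked, so a script such as `/api.py` would need its own explicit line.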
Actually, I didn't know much about robots.txt. When I googled it, the examples showed /filename and /foldername/, so I thought robots.txt worked on files and folders, but now I know that it blocks URLs. I have changed the file now. Thanks :D
This looks good to me.
I agree with Gentlecat's last comment -- use trailing slashes for directories. Then block specific scripts with explicit lines. Make this fix and this should be good to go.
@mayhem @gentlecat I have made the changes :)
Changed the robots.txt file to block the paths to the new Last.fm API and auth pages.
Issue Link: https://tickets.metabrainz.org/browse/LB-113