| Title: | Twitter/X Scraping via Python's 'twscrape' Library |
|---|---|
| Description: | A comprehensive R interface to Python's 'twscrape' library for scraping Twitter/X data. This package uses 'reticulate' to provide a seamless R interface to the fully functional Python 'twscrape' library. Supports searching tweets, user timelines, followers, and more, with built-in rate limiting and multi-account support. Built on top of 'twscrape' by vladkens <https://github.com/vladkens/twscrape> and inspired by 'snscrape' by JustAnotherArchivist <https://github.com/JustAnotherArchivist/snscrape>. |
| Authors: | Agustin Nieto [aut, cre], Claude AI [ctb] (Package development assistance) |
| Maintainer: | Agustin Nieto <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.3 |
| Built: | 2026-06-04 23:11:46 UTC |
| Source: | https://github.com/agusnieto77/twscraper |

Adds a Twitter account with cookies and activates it automatically. IMPORTANT: cookies are required to activate the account correctly.

    add_account(
      username,
      password,
      email,
      email_password,
      cookies,
      db_file = "accounts.db",
      verbose = TRUE
    )

| Argument | Description |
|---|---|
| username | Username |
| password | Password |
| email | Account email |
| email_password | Email password |
| cookies | Session cookies (REQUIRED; format: "auth_token=...; ct0=...") |
| db_file | Database file (default: "accounts.db") |
| verbose | If 'TRUE', emits informational messages via 'message()'/'warning()'; these can be silenced with 'suppressMessages()'/'suppressWarnings()'. |

Invisibly returns 'TRUE' if the account was added and is active; invisibly returns 'FALSE' if it was not added, already existed, or could not be activated.

    if (check_setup() && interactive()) {
      # add_account(
      #   username = "your_username",
      #   password = "your_password",
      #   email = "[email protected]",
      #   email_password = "your_email_password",
      #   cookies = "auth_token=...; ct0=..."
      # )
    }
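The 'cookies' argument is documented as a single string of the form "auth_token=...; ct0=...". The following is a minimal sketch of assembling that string from two separately stored values. 'build_cookie_string()' is a hypothetical helper, not a function of this package, and the token values are placeholders.

```r
# Hypothetical helper (not part of twscraper): joins two browser cookie
# values into the "auth_token=...; ct0=..." format documented above.
build_cookie_string <- function(auth_token, ct0) {
  stopifnot(
    is.character(auth_token), length(auth_token) == 1, nzchar(auth_token),
    is.character(ct0), length(ct0) == 1, nzchar(ct0)
  )
  paste0("auth_token=", auth_token, "; ct0=", ct0)
}

# Placeholder values for illustration only.
cookies <- build_cookie_string("PLACEHOLDER_TOKEN", "PLACEHOLDER_CT0")
```

The resulting string could then be passed as the 'cookies' argument of 'add_account()'.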

Reads credentials from '.Renviron' or environment variables and adds the account without exposing secrets in code.

    add_account_from_env(prefix = "TWS_", db_file = "accounts.db", verbose = TRUE)

| Argument | Description |
|---|---|
| prefix | Prefix of the environment variables (default: "TWS_") |
| db_file | Database file (default: "accounts.db") |
| verbose | If 'TRUE', emits informational messages via 'message()'/'warning()'; these can be silenced with 'suppressMessages()'/'suppressWarnings()'. |

'TRUE' if the account was added and activated; 'FALSE' if it was added but did not become active, or if an error occurred.

    # In .Renviron:
    # TWS_USERNAME='my_username'
    # TWS_PASSWORD='my_password'
    # TWS_EMAIL='[email protected]'
    # TWS_EMAIL_PASSWORD='email_pass'
    # TWS_AUTH_TOKEN='auth_token_from_browser'
    # TWS_CT0='ct0_from_browser'

    if (check_setup() && interactive()) {
      add_account_from_env()

      # For a second account, use a different prefix:
      # TWS2_USERNAME='other_account'
      # TWS2_PASSWORD='other_password'
      # ...
      add_account_from_env(prefix = "TWS2_")
    }
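To illustrate how a prefix maps onto the variable names listed in the example, here is a rough sketch that collects the six variables with 'Sys.getenv()' and fails early if any are missing. 'read_account_env()' is a hypothetical helper written for this illustration; it is not the package's internal code, and the 'DEMO_' values are throwaway placeholders.

```r
# Hypothetical sketch: gather the variables named in the example above
# for a given prefix. Not the package's implementation.
read_account_env <- function(prefix = "TWS_") {
  keys <- c("USERNAME", "PASSWORD", "EMAIL", "EMAIL_PASSWORD", "AUTH_TOKEN", "CT0")
  values <- Sys.getenv(paste0(prefix, keys), unset = NA)
  names(values) <- tolower(keys)

  # Report every unset variable at once rather than one at a time.
  absent <- names(values)[is.na(values)]
  if (length(absent) > 0) {
    stop("Missing environment variables: ",
         paste0(prefix, toupper(absent), collapse = ", "))
  }
  as.list(values)
}

# Demonstration with throwaway values set for this session only.
Sys.setenv(
  DEMO_USERNAME = "u", DEMO_PASSWORD = "p", DEMO_EMAIL = "e",
  DEMO_EMAIL_PASSWORD = "ep", DEMO_AUTH_TOKEN = "t", DEMO_CT0 = "c"
)
creds <- read_account_env("DEMO_")
```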

Converts a single user to a one-row data.frame.

    ## S3 method for class 'twscraper_user'
    as.data.frame(x, ...)

| Argument | Description |
|---|---|
| x | User returned by get_user() |
| ... | Additional arguments |

A data.frame.

Checks whether Python and twscrape are configured in the current session.

    check_setup(...)

| Argument | Description |
|---|---|
| ... | Ignored; reserved for future compatibility. |

'TRUE' if twscrapeR is configured in the current session; 'FALSE' otherwise.

    check_setup()

Deletes an account from the database.

    delete_account(username, db_file = "accounts.db", verbose = TRUE)

| Argument | Description |
|---|---|
| username | Username of the account to delete |
| db_file | Database file (default: "accounts.db") |
| verbose | If 'TRUE', emits informational messages via 'message()'/'warning()'; these can be silenced with 'suppressMessages()'/'suppressWarnings()'. |

Invisibly returns 'TRUE' if the account was deleted; invisibly returns 'FALSE' otherwise.

    if (check_setup() && interactive()) {
      # delete_account("your_username")
    }

Filters tweets by date range.

    filter_by_date(tweets, from = NULL, to = NULL)

| Argument | Description |
|---|---|
| tweets | List of tweets |
| from | Start date (POSIXct or character) |
| to | End date (POSIXct or character) |

Filtered list of tweets.

    tweets <- list(
      list(id = "1", date = as.POSIXct("2026-01-01")),
      list(id = "2", date = as.POSIXct("2026-02-01"))
    )
    class(tweets) <- c("twscraper_tweets", "list")
    tweets_recent <- filter_by_date(tweets, from = "2025-10-01")
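For readers who want to see what a date-range filter over this list structure amounts to, here is a base R sketch. 'keep_in_range()' is a hypothetical stand-in, and the inclusive boundaries and UTC time zone are assumptions; 'filter_by_date()' itself may treat boundaries and time zones differently.

```r
# Illustrative only: a base R date-range filter over a list of tweets.
# Boundaries are inclusive here; that is an assumption, not documented
# behaviour of filter_by_date().
keep_in_range <- function(tweets, from = NULL, to = NULL) {
  if (!is.null(from)) from <- as.POSIXct(from, tz = "UTC")
  if (!is.null(to)) to <- as.POSIXct(to, tz = "UTC")
  Filter(function(tw) {
    (is.null(from) || tw$date >= from) && (is.null(to) || tw$date <= to)
  }, tweets)
}

tweets <- list(
  list(id = "1", date = as.POSIXct("2026-01-01", tz = "UTC")),
  list(id = "2", date = as.POSIXct("2026-02-01", tz = "UTC"))
)

# Keeps only the tweet dated within January 2026.
january <- keep_in_range(tweets, from = "2026-01-01", to = "2026-01-31")
```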

Filters tweets by language.

    filter_by_lang(tweets, lang)

| Argument | Description |
|---|---|
| tweets | List of tweets |
| lang | Language code (e.g. "es", "en", "pt") |

Filtered list of tweets.

    tweets <- list(
      list(id = "1", lang = "es"),
      list(id = "2", lang = "en")
    )
    class(tweets) <- c("twscraper_tweets", "list")
    tweets_es <- filter_by_lang(tweets, "es")

Gets the list of a user's followers.

    get_followers(username, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| username | Username (without @) |
| n | Maximum number of followers (default: 100) |
| progress | Show progress (default: TRUE) |

List of users.

    if (check_setup()) {
      followers <- get_followers("elonmusk", n = 100)
    }

Gets the list of users that a user follows.

    get_following(username, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| username | Username (without @) |
| n | Maximum number of users (default: 100) |
| progress | Show progress (default: TRUE) |

List of users.

    if (check_setup()) {
      following <- get_following("elonmusk", n = 100)
    }

Gets the list of users who retweeted a specific tweet.

    get_retweeters(tweet_id, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| tweet_id | Tweet ID |
| n | Maximum number of users (default: 100) |
| progress | Show progress (default: TRUE) |

List of users.

    if (check_setup()) {
      retweeters <- get_retweeters(1234567890, n = 100)
    }

Gets the users who retweeted several tweets in a single R call.

    get_retweeters_batch(tweets, n = 100, progress = TRUE, flatten = TRUE)

| Argument | Description |
|---|---|
| tweets | Vector of IDs, 'twscraper_tweets' list of tweets, single tweet, or data.frame with an 'id'/'tweet_id' column. |
| n | Maximum number of users per tweet (default: 100) |
| progress | Show progress (default: TRUE) |
| flatten | If TRUE, returns a flat list of users, each with a 'source_tweet_id' field, for use with 'to_dataframe()'. If FALSE, returns a list grouped by tweet. |

List of users with 'source_tweet_id', or a list grouped by tweet.

    if (check_setup()) {
      tweets <- search_tweets("rstats", n = 10)
      retweeters <- get_retweeters_batch(tweets, n = 50)
      retweeters_df <- to_dataframe(retweeters)
      retweeters_by_tweet <- get_retweeters_batch(tweets, n = 50, flatten = FALSE)
    }
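The difference between the grouped and the flat result can be sketched with mock data. The shape of the grouped list below (a list named by tweet ID) is an assumption made for illustration, since the documentation only says the result is "grouped by tweet"; the usernames are invented.

```r
# Assumed shape of a grouped result: a list named by tweet ID, each
# element holding the users who retweeted that tweet (mock data).
grouped <- list(
  "101" = list(list(username = "ana"), list(username = "ben")),
  "102" = list(list(username = "cat"))
)

# Flatten one level, tagging every user with the tweet it came from.
flat <- unlist(
  lapply(names(grouped), function(tweet_id) {
    lapply(grouped[[tweet_id]], function(user) {
      user$source_tweet_id <- tweet_id
      user
    })
  }),
  recursive = FALSE
)
```

Keeping 'source_tweet_id' on each user is what lets a single flat table still answer "who retweeted which tweet".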

Gets detailed information about a user.

    get_user(username, progress = TRUE)

| Argument | Description |
|---|---|
| username | Username (without @) |
| progress | Show progress (default: TRUE) |

List with the user's information.

    if (check_setup()) {
      user <- get_user("hadleywickham")
    }

Lists all configured accounts.

    list_accounts(db_file = "accounts.db", verbose = TRUE)

| Argument | Description |
|---|---|
| db_file | Database file (default: "accounts.db") |
| verbose | If 'TRUE', emits informational messages via 'message()'/'warning()'; these can be silenced with 'suppressMessages()'/'suppressWarnings()'. |

Invisibly returns a list of configured accounts. Each account contains 'username', 'email', 'active', and 'locks'; an empty list indicates no accounts or an error.

    if (check_setup()) {
      accounts <- list_accounts()
      print(accounts)
    }
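Since the return value is described as a list of accounts with 'username', 'email', 'active', and 'locks', picking out the active ones could look like the sketch below. The mock list stands in for real output, and the types of 'active' (logical) and 'locks' (an empty list) are assumptions.

```r
# Mock of the documented return shape; real values would come from
# list_accounts(). Field types are assumed for illustration.
accounts <- list(
  list(username = "acct_a", email = "a@example.com", active = TRUE, locks = list()),
  list(username = "acct_b", email = "b@example.com", active = FALSE, locks = list())
)

# Keep only accounts whose 'active' field is exactly TRUE.
active <- Filter(function(acc) isTRUE(acc$active), accounts)
active_names <- vapply(active, function(acc) acc$username, character(1))
```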

Prints information about a single tweet.

    ## S3 method for class 'twscraper_tweet'
    print(x, ...)

| Argument | Description |
|---|---|
| x | Information about a tweet |
| ... | Additional arguments |

Invisibly returns 'x', a 'twscraper_tweet' list with tweet metadata, author fields, engagement counts, language, and URL.

Prints a summary of tweets.

    ## S3 method for class 'twscraper_tweets'
    print(x, ...)

| Argument | Description |
|---|---|
| x | List of tweets |
| ... | Additional arguments |

Invisibly returns 'x', a 'twscraper_tweets' list. Each element represents one tweet with fields such as 'id', 'date', 'text', 'username', engagement counts, language, and URL.

Prints information about a user.

    ## S3 method for class 'twscraper_user'
    print(x, ...)

| Argument | Description |
|---|---|
| x | Information about a user |
| ... | Additional arguments |

Invisibly returns 'x', a 'twscraper_user' list with fields such as 'id', 'username', 'displayname', 'description', follower/following counts, verification status, location, URL, and profile image URL.

Prints a summary of users.

    ## S3 method for class 'twscraper_users'
    print(x, ...)

| Argument | Description |
|---|---|
| x | List of users |
| ... | Additional arguments |

Invisibly returns 'x', a 'twscraper_users' list. Each element represents one user with profile metadata and account statistics.

Saves tweets in CSV format.

    save_csv(tweets, file)

| Argument | Description |
|---|---|
| tweets | List of tweets |
| file | File name |

Invisibly returns the output file path. The CSV contains one row per tweet/user-like item after conversion with 'to_dataframe()'.

    tweets <- list(list(id = "1", text = "Hello R", username = "rstats"))
    save_csv(tweets, tempfile(fileext = ".csv"))

Saves tweets in JSON format.

    save_json(tweets, file)

| Argument | Description |
|---|---|
| tweets | List of tweets |
| file | File name |

Invisibly returns the output file path. The JSON file contains the list structure supplied in 'tweets', written with 'jsonlite::write_json()'.

    tweets <- list(list(id = "1", text = "Hello R", username = "rstats"))
    save_json(tweets, tempfile(fileext = ".json"))

Searches for tweets containing a specific hashtag.

    search_hashtag(hashtag, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| hashtag | Hashtag (with or without #) |
| n | Maximum number of tweets (default: 100) |
| progress | Show progress (default: TRUE) |

List of tweets.

    if (check_setup()) {
      tweets <- search_hashtag("rstats", n = 50)
      tweets <- search_hashtag("#datascience", n = 100)
    }
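Accepting a hashtag "with or without #" implies some normalization step. One plausible way to do it is sketched below; 'normalize_hashtag()' is a hypothetical helper, and the package may normalize its input differently.

```r
# Hypothetical normalization: strip a single leading "#" if present,
# so both spellings reduce to the same bare tag.
normalize_hashtag <- function(hashtag) sub("^#", "", hashtag)

normalized <- normalize_hashtag(c("rstats", "#datascience"))
```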

Searches for tweets that mention a specific user.

    search_mentions(username, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| username | Username (without @) |
| n | Maximum number of tweets (default: 100) |
| progress | Show progress (default: TRUE) |

List of tweets.

    if (check_setup()) {
      mentions <- search_mentions("hadleywickham", n = 50)
    }

Searches for tweets using a query.

    search_tweets(query, n = 100, progress = TRUE, product = c("Latest", "Top"))

| Argument | Description |
|---|---|
| query | Search query (e.g. "rstats", "#datascience", "from:username") |
| n | Maximum number of tweets to return (default: 100) |
| progress | Show progress bar (default: TRUE) |
| product | Search type: "Latest" for recent tweets or "Top" for top tweets (default: "Latest") |

List of tweets.

    if (check_setup()) {
      tweets <- search_tweets("rstats", n = 50)
      top_tweets <- search_tweets("rstats", n = 50, product = "Top")
      df <- to_dataframe(tweets)
    }
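The signature 'product = c("Latest", "Top")' with a documented default of "Latest" matches the usual R idiom of resolving a choice vector with 'match.arg()', which picks the first element when the argument is omitted. Whether 'search_tweets()' uses 'match.arg()' internally is an assumption; 'pick_product()' below is a hypothetical function that only demonstrates the idiom.

```r
# Hypothetical function demonstrating the match.arg() idiom; it is not
# part of the package.
pick_product <- function(product = c("Latest", "Top")) {
  # With no argument supplied, match.arg() returns the first choice.
  match.arg(product)
}

default_choice <- pick_product()
explicit_choice <- pick_product("Top")
```

Under this idiom, a value outside the listed choices raises an error instead of being passed through silently.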

Checks that Python and the Python 'twscrape' package are available and configures the active reticulate session. It does not install Python or Python packages automatically.

    setup_twscraper(python_path = NULL, install_python = TRUE, ask = TRUE)

| Argument | Description |
|---|---|
| python_path | Path to Python (optional; detected automatically) |
| install_python | Deprecated compatibility argument. Ignored; installation must be performed by the user outside package functions. |
| ask | Deprecated compatibility argument. Ignored; retained to avoid breaking existing calls. |

Invisibly returns 'TRUE' when Python >= 3.10 and the Python 'twscrape' module are available and configured. Invisibly returns 'FALSE' otherwise, after printing installation instructions.

    if (interactive()) {
      setup_twscraper()
    }
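The "Python >= 3.10" requirement is easy to get wrong when checked by hand, because 3.10 read as a number equals 3.1. A small sketch of a component-wise comparison with base R's 'numeric_version()' follows; 'meets_minimum()' is a hypothetical helper and the version strings are sample inputs, not output from this package.

```r
# numeric_version() compares versions component by component, so "3.10"
# correctly ranks above "3.9" (unlike a plain numeric comparison, where
# 3.10 and 3.1 are the same number).
meets_minimum <- function(version, minimum = "3.10") {
  numeric_version(version) >= numeric_version(minimum)
}

# Sample version strings for illustration.
checks <- vapply(c("3.9.18", "3.10.0", "3.12.4"), meets_minimum, logical(1))
```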

Sorts tweets by a given criterion.

    sort_tweets(tweets, by = "date", decreasing = TRUE)

| Argument | Description |
|---|---|
| tweets | List of tweets |
| by | Field to sort by ("date", "like_count", "retweet_count", "views_count") |
| decreasing | Sort in descending order (default: TRUE) |

Sorted list of tweets.

    tweets <- list(
      list(id = "1", date = as.POSIXct("2026-01-01"), like_count = 2),
      list(id = "2", date = as.POSIXct("2026-02-01"), like_count = 10)
    )
    class(tweets) <- c("twscraper_tweets", "list")
    top_tweets <- sort_tweets(tweets, by = "like_count")
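As a rough picture of what sorting a list of tweets by one field involves, the base R sketch below extracts the field from every tweet and reorders the list with 'order()'. 'order_by_field()' is a hypothetical stand-in, not the package's code, and it assumes every tweet carries the chosen field.

```r
# Illustrative base R sort over a list of tweets (not sort_tweets()
# itself). Assumes each tweet has the requested field.
order_by_field <- function(tweets, by = "date", decreasing = TRUE) {
  # POSIXct dates convert to seconds since the epoch, counts stay numeric.
  keys <- vapply(tweets, function(tw) as.numeric(tw[[by]]), numeric(1))
  tweets[order(keys, decreasing = decreasing)]
}

tweets <- list(
  list(id = "1", date = as.POSIXct("2026-01-01", tz = "UTC"), like_count = 2),
  list(id = "2", date = as.POSIXct("2026-02-01", tz = "UTC"), like_count = 10)
)

most_liked_first <- order_by_field(tweets, by = "like_count")
oldest_first <- order_by_field(tweets, by = "date", decreasing = FALSE)
```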

Converts a list of tweets or users to a data.frame using purrr::map_dfr.

    to_dataframe(x)

| Argument | Description |
|---|---|
| x | List of tweets or users |

A data.frame or tibble.

    tweets <- list(
      list(
        id = "1",
        date = as.POSIXct("2026-01-01"),
        text = "Hello R",
        username = "rstats",
        user_displayname = "R Stats",
        user_id = "10",
        reply_count = 0,
        retweet_count = 1,
        like_count = 2,
        quote_count = 0,
        views_count = 100,
        lang = "en",
        url = "https://example.com/1",
        user_followers = 1000,
        user_verified = FALSE
      )
    )
    class(tweets) <- c("twscraper_tweets", "list")
    df <- to_dataframe(tweets)

    users <- list(list(
      id = "10",
      username = "rstats",
      displayname = "R Stats",
      description = "Example user",
      followers_count = 1000,
      following_count = 50,
      tweets_count = 200,
      verified = FALSE,
      created = "2026-01-01",
      location = "",
      url = "",
      profile_image_url = ""
    ))
    class(users) <- c("twscraper_users", "list")
    users_df <- to_dataframe(users)

    users[[1]]$source_tweet_id <- "1"
    retweeters_df <- to_dataframe(users) # includes source_tweet_id
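The row-binding idea behind this conversion can be shown without purrr. The sketch below uses only base R and assumes every item has the same fields with scalar values; it is an approximation for illustration, not the package's implementation, and the sample items are invented.

```r
# Base R approximation of a list-to-data.frame conversion: one row per
# item. Assumes all items share the same scalar fields.
items <- list(
  list(id = "1", text = "Hello R", like_count = 2),
  list(id = "2", text = "Hello again", like_count = 10)
)

rows <- lapply(items, function(item) as.data.frame(item, stringsAsFactors = FALSE))
df <- do.call(rbind, rows)
```

Unlike this sketch, a row-binding helper such as purrr::map_dfr can fill in 'NA' when an item lacks a field, which matters for scraped data with optional fields.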

Gets detailed information about a specific tweet by its ID.

    tweet_details(tweet_id, progress = TRUE)

| Argument | Description |
|---|---|
| tweet_id | Tweet ID (number or string) |
| progress | Show progress (default: TRUE) |

List with the tweet's information, or NULL if it is not found.

    if (check_setup()) {
      tweet <- tweet_details(1234567890)
    }
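The 'tweet_id' argument accepts a number or a string, but the two are not equally safe for long IDs. R stores numbers as doubles, which represent integers exactly only up to 2^53, so two different 19-digit IDs can collapse to the same value. The sketch below demonstrates this with two made-up IDs; how the package converts numeric IDs internally is not stated in this manual.

```r
# Two made-up 19-digit IDs that differ only in the last digits.
id_a <- "1234567890123456789"
id_b <- "1234567890123456790"

# As character strings the IDs stay distinct.
distinct_as_text <- !identical(id_a, id_b)

# As doubles both round to the same representable value and collide.
collide_as_numeric <- as.numeric(id_a) == as.numeric(id_b)
```

Passing long IDs as strings, for example 'tweet_details("1234567890123456789")', avoids this loss of precision on the R side.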

Gets the replies to a specific tweet.

    tweet_replies(tweet_id, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| tweet_id | Tweet ID |
| n | Maximum number of replies (default: 100) |
| progress | Show progress (default: TRUE) |

List of tweets (replies).

    if (check_setup()) {
      replies <- tweet_replies(1234567890, n = 50)
    }

Gets only a user's tweets that contain images or videos.

    user_media(username, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| username | Username (without @) |
| n | Maximum number of tweets (default: 100) |
| progress | Show progress (default: TRUE) |

List of tweets.

    if (check_setup()) {
      media_tweets <- user_media("elonmusk", n = 100)
    }

Gets the recent tweets of a specific user.

    user_tweets(username, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| username | Username (without @) |
| n | Maximum number of tweets (default: 100) |
| progress | Show progress (default: TRUE) |

List of tweets.

    if (check_setup()) {
      tweets <- user_tweets("hadleywickham", n = 50)
    }

Gets a user's tweets and replies (full timeline).

    user_tweets_and_replies(username, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| username | Username (without @) |
| n | Maximum number of tweets (default: 100) |
| progress | Show progress (default: TRUE) |

List of tweets.

    if (check_setup()) {
      all_tweets <- user_tweets_and_replies("elonmusk", n = 100)
    }

Gets only the followers that have a verified account.

    verified_followers(username, n = 100, progress = TRUE)

| Argument | Description |
|---|---|
| username | Username (without @) |
| n | Maximum number of followers (default: 100) |
| progress | Show progress (default: TRUE) |

List of verified users.

    if (check_setup()) {
      verified <- verified_followers("elonmusk", n = 100)
    }