
Web Scraping with Selenium / SeleniumBase / Undetected Chromedriver and Cython+Pandas

hansalemaos/cythonselenium

Fast Web Scraping with Python and Selenium / SeleniumBase / Undetected Chromedriver

pip install cythonselenium

Cython and a C/C++ compiler must be installed. The module is compiled the first time you import it.
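Because compilation happens on first import, a missing toolchain only shows up as an import-time failure. The sketch below is a small pre-flight check that uses only the standard library. The function name `check_build_prerequisites` and the list of compiler executables are assumptions made for illustration; the package itself may locate a compiler differently (for example, MSVC is often not on PATH outside a developer prompt).

```python
import importlib.util
import shutil

# Hypothetical list of compiler executables to look for; adjust for your platform.
CANDIDATE_COMPILERS = ("cl", "gcc", "g++", "clang", "clang++")


def check_build_prerequisites(compilers=CANDIDATE_COMPILERS):
    """Report whether Cython is importable and which compilers are on PATH."""
    cython_installed = importlib.util.find_spec("Cython") is not None
    found = [name for name in compilers if shutil.which(name) is not None]
    return {"cython": cython_installed, "compilers": found}


if __name__ == "__main__":
    status = check_build_prerequisites()
    print("Cython importable:", status["cython"])
    print("Compilers on PATH:", ", ".join(status["compilers"]) or "none found")
```

An empty compiler list here does not prove that the build will fail, only that none of the listed executables were found on PATH.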

  • Provides a structured DataFrame format for web elements, simplifying data manipulation.
  • Seamlessly integrates with Selenium / SeleniumBase WebDriver for robust browser interactions.
  • Capable of handling dynamic content and content within iframes, crucial for modern web applications.
  • Enables execution of JavaScript functions directly on web elements.
  • Offers customizable CSS selectors, waiting conditions, and method attachments for flexible web scraping.
  • Incorporates explicit waits to enhance the reliability of web interactions.
  • Supports repeat queries for dealing with asynchronously loaded content.
  • Automatically switches to the correct frame before performing actions like clicking, ensuring seamless interaction with elements across multiple frames.
  • Retrieves all elements in a single request to optimize performance and reduce the load on the web server.
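The idea behind repeat queries can be illustrated with a generic polling loop. This is not the library's implementation, only a sketch of the pattern: `poll_until`, `fetch`, `is_ready`, and `delay` are hypothetical names, and only `max_repeats` echoes a parameter documented below.

```python
import time


def poll_until(fetch, is_ready, max_repeats=5, delay=0.5):
    """Call fetch() up to max_repeats times until is_ready(result) is true.

    Returns the last result, whether or not it satisfied is_ready.
    """
    if max_repeats < 1:
        raise ValueError("max_repeats must be at least 1")
    result = None
    for attempt in range(max_repeats):
        result = fetch()
        if is_ready(result):
            break
        if attempt < max_repeats - 1:
            time.sleep(delay)  # give the page time to load more content
    return result
```

With a frame getter such as the `getframe` object from the example further down, a call might look like `poll_until(lambda: getframe("a"), lambda df: "aa_href" in df.columns, max_repeats=10)`.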
# Attributes:
#     driver (WebDriver): The Selenium WebDriver instance used to control the browser.
#     By (By): Selenium By class used to locate elements on a web page.
#     WebDriverWait (WebDriverWait): Selenium WebDriverWait class used for implementing explicit waits.
#     expected_conditions (expected_conditions): Module in Selenium used to set expected conditions for explicit waits.
#     queryselector (str): CSS selector used to query and return elements from the DOM. Defaults to '*' which selects all elements.
#     repeat_until_element_in_columns (optional): Specific element to be checked for its presence in the dataframe columns before stopping the query. Useful for waiting on AJAX or dynamically loaded content.
#     max_repeats (int): Maximum number of iterations to perform when checking for the presence of 'repeat_until_element_in_columns'. Defaults to 1.
#     with_methods (bool): Flag to determine if JavaScript methods should be attached to the elements in the resulting dataframe. Defaults to True.

# Methods:
#     __call__(queryselector=None, with_methods=None, repeat_until_element_in_columns=None, max_repeats=None, driver=None, By=None, WebDriverWait=None, expected_conditions=None):
#         Generates a dataframe of web elements based on the specified query selector. The dataframe can include methods attached to these elements if 'with_methods' is True. This method allows for overriding class attributes during its call for flexibility in querying different elements without needing to create multiple instances of the class.

from cythonselenium import SeleniumFrame
from seleniumbase import Driver
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
import subprocess
from time import sleep

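# Kill leftover uc_driver.exe processes from earlier SeleniumBase runs.
# taskkill is a Windows command; skip this step on other platforms.
subprocess.run("taskkill /IM uc_driver.exe /F", shell=True)
# Give the operating system a moment to release the processes
sleep(5)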

max_col_width = 30


if __name__ == "__main__":
    driver = Driver(uc=True)
    getframe = SeleniumFrame(
        driver=driver,
        By=By,
        WebDriverWait=WebDriverWait,
        expected_conditions=expected_conditions,
        queryselector="*",
        repeat_until_element_in_columns=None,
        max_repeats=1,
        with_methods=True,
    )
    driver.get(r"https://www.whitehouse.gov/")

    df = getframe("*")
    # Use https://github.com/directvt/vtm to view the output without line wrapping
    print(df)

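    # Get only links (query selector "a")
    df2 = getframe("a")

    # Filter the rows whose link text is exactly "News"
    df3 = df2.loc[df2.aa_text == "News"]

    # Click the first match, without worrying about switching to the correct frame.
    # Use one of the two clicks, not both: once the first click navigates
    # away, the element may be stale by the time the second one runs.
    # JS click - always works
    df3.iloc[0].js_click()
    # Selenium click (alternative)
    # df3.iloc[0].se_click()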
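Since `__call__` accepts the same keywords as the constructor, a single instance can serve several queries. The sketch below wraps that in a helper. `query_many` is a hypothetical name, and the only assumption about `getframe` is that it accepts the keyword arguments listed under "Methods" above.

```python
def query_many(getframe, selectors, **overrides):
    """Run one query per CSS selector on the same frame getter.

    `getframe` is expected to accept the documented keyword arguments
    (queryselector, with_methods, max_repeats, ...). Any extra keywords
    in `overrides` are passed through unchanged to every call.
    """
    return {
        selector: getframe(queryselector=selector, **overrides)
        for selector in selectors
    }
```

For example, `frames = query_many(getframe, ["a", "img"], with_methods=False)` would return one DataFrame per selector, keyed by the selector string, without attaching methods to the elements.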

Example of a DataFrame (each cell cropped to 50 characters)

index element frame elements_in_frame aa_relList aa_text aa_origin aa_host aa_hostname aa_pathname aa_hash aa_href aa_offsetParent aa_offsetTop aa_offsetLeft aa_offsetWidth aa_offsetHeight aa_innerText aa_outerText aa_className aa_classList aa_innerHTML aa_outerHTML aa_scrollWidth aa_scrollHeight aa_clientWidth aa_clientHeight aa_nextElementSibling aa_parentNode aa_parentElement aa_firstChild aa_lastChild aa_nextSibling aa_textContent aa_rel aa_title aa_firstElementChild aa_lastElementChild aa_childElementCount aa_target js_toString js_attachInternals js_blur js_click js_focus js_hidePopover js_showPopover js_togglePopover js_after js_animate js_append js_attachShadow js_before js_checkVisibility js_closest js_computedStyleMap js_getAnimations js_getAttribute js_getAttributeNS js_getAttributeNames js_getAttributeNode js_getAttributeNodeNS js_getBoundingClientRect js_getClientRects js_getElementsByClassName js_getElementsByTagName js_getElementsByTagNameNS js_getHTML js_hasAttribute js_hasAttributeNS js_hasAttributes js_hasPointerCapture js_insertAdjacentElement js_insertAdjacentHTML js_insertAdjacentText js_matches js_prepend js_querySelector js_querySelectorAll js_releasePointerCapture js_remove js_removeAttribute js_removeAttributeNS js_removeAttributeNode js_replaceChildren js_replaceWith js_requestFullscreen js_requestPointerLock js_scroll js_scrollBy js_scrollIntoView js_scrollIntoViewIfNeeded js_scrollTo js_setAttribute js_setAttributeNS js_setAttributeNode js_setAttributeNodeNS js_setHTMLUnsafe js_setPointerCapture js_toggleAttribute js_webkitMatchesSelector js_webkitRequestFullScreen js_webkitRequestFullscreen js_appendChild js_cloneNode js_compareDocumentPosition js_contains js_getRootNode js_hasChildNodes js_insertBefore js_isDefaultNamespace js_isEqualNode js_isSameNode js_lookupNamespaceURI js_lookupPrefix js_normalize js_removeChild js_replaceChild js_addEventListener js_dispatchEvent js_removeEventListener js_wheel js_change_html_value se_send_keys 
se_find_elements se_find_element se_is_displayed se_is_enabled se_is_selected se_clear se_click se_switch_to_frame se_location_once_scrolled_into_view se_get_screenshot_as_file se_screenshot aa_window_handle aa_window_switch
0 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Skip to content https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov / #wp--skip-link--target https://www.whitehouse.gov/#wp--skip-link--target [object HTMLBodyElement] -1 -1 1 1 Skip to content Skip to content skip-link screen-reader-text skip-link screen-reader-text Skip to content <a class="skip-link screen-reader-text" href="#wp- 69 93 1 1 [object HTMLDivElement] [object HTMLBodyElement] [object HTMLBodyElement] [object Text] [object Text] [object HTMLDivElement] Skip to content () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
1 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 News https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /news/ https://www.whitehouse.gov/news/ [object HTMLDivElement] 289 93 50 NEWS NEWS News News</a 93 50 93 50 [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] News () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
2 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Administration https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /administration/ https://www.whitehouse.gov/administration/ [object HTMLDivElement] 382 181 50 ADMINISTRATION ADMINISTRATION Administration <a href="https://www.whitehouse.gov/administration 181 50 181 50 [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Administration () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
3 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Issues https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /issues/ https://www.whitehouse.gov/issues/ [object HTMLDivElement] 563 103 50 ISSUES ISSUES Issues Issue 103 50 103 50 [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Issues () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
4 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 home The White House\n\n \n President Donald https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov / https://www.whitehouse.gov/ [object HTMLDivElement] 221 150 THE WHITE HOUSE\nPRESIDENT DONALD J. TRUMP THE WHITE HOUSE\nPRESIDENT DONALD J. TRUMP \n <source type="image/webp" srcset <a href="https://www.whitehouse.gov" rel="home" ti 221 150 221 150 [object HTMLDivElement] [object HTMLDivElement] [object Text] [object Text] [object Text] The White House\n\n \n President Donald home The White House [object HTMLPictureElement] [object HTMLSpanElement] 3 () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
5 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 News https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /news/ https://www.whitehouse.gov/news/ News News News <a data-wp-on-async--mouseenter="callbacks.handleP [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] News () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
6 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Administration https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /administration/ https://www.whitehouse.gov/administration/ Administration Administration Administration <a data-wp-on-async--mouseenter="callbacks.handleP [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Administration () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
7 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Issues https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /issues/ https://www.whitehouse.gov/issues/ Issues Issues Issues <a data-wp-on-async--mouseenter="callbacks.handleP [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Issues () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
8 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Contact https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /contact/ https://www.whitehouse.gov/contact/ Contact Contact Contact <a data-wp-on-async--mouseenter="callbacks.handleP [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Contact () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
9 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Visit https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /visit/ https://www.whitehouse.gov/visit/ Visit Visit Visit <a data-wp-on-async--mouseenter="callbacks.handleP [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Visit () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
10 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 noopener nofollow X https://twitter.com twitter.com twitter.com /whitehouse https://twitter.com/whitehouse X X wp-block-social-link-anchor wp-block-social-link-anchor <svg width="24" height="24" viewBox="0 0 24 24" xm <a rel="noopener nofollow" target="_blank" href="h [object HTMLLIElement] [object HTMLLIElement] [object SVGSVGElement] [object HTMLSpanElement] X noopener nofollow [object SVGSVGElement] [object HTMLSpanElement] 2 _blank () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
11 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 noopener nofollow Instagram https://www.instagram.com www.instagram.com www.instagram.com /whitehouse/ https://www.instagram.com/whitehouse/ Instagram Instagram wp-block-social-link-anchor wp-block-social-link-anchor <svg width="24" height="24" viewBox="0 0 24 24" xm <a rel="noopener nofollow" target="_blank" href="h [object HTMLLIElement] [object HTMLLIElement] [object SVGSVGElement] [object HTMLSpanElement] Instagram noopener nofollow [object SVGSVGElement] [object HTMLSpanElement] 2 _blank () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
12 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 noopener nofollow Facebook https://www.facebook.com www.facebook.com www.facebook.com /WhiteHouse/ https://www.facebook.com/WhiteHouse/ Facebook Facebook wp-block-social-link-anchor wp-block-social-link-anchor <svg width="24" height="24" viewBox="0 0 24 24" xm <a rel="noopener nofollow" target="_blank" href="h [object HTMLLIElement] [object HTMLLIElement] [object SVGSVGElement] [object HTMLSpanElement] Facebook noopener nofollow [object SVGSVGElement] [object HTMLSpanElement] 2 _blank () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
13 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 follow https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /presidential-actions/ https://www.whitehouse.gov/presidential-actions/ [object HTMLDivElement] 539 392 wp-block-cover__action wp-block-cover__action <a class="wp-block-cover__action" href="https://ww 539 392 539 392 [object HTMLSpanElement] [object HTMLDivElement] [object HTMLDivElement] [object HTMLSpanElement] follow _self () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
14 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 follow https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /news/ https://www.whitehouse.gov/news/ [object HTMLDivElement] 539 392 wp-block-cover__action wp-block-cover__action <a class="wp-block-cover__action" href="https://ww 539 392 539 392 [object HTMLSpanElement] [object HTMLDivElement] [object HTMLDivElement] [object HTMLSpanElement] follow _self () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
15 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 follow https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /administration/donald-j-trump/ https://www.whitehouse.gov/administration/donald-j [object HTMLDivElement] 300 420 wp-block-cover__action wp-block-cover__action <a class="wp-block-cover__action" href="https://ww 300 420 300 420 [object HTMLSpanElement] [object HTMLDivElement] [object HTMLDivElement] [object HTMLSpanElement] follow _self () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
16 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 follow https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /administration/jd-vance/ https://www.whitehouse.gov/administration/jd-vance [object HTMLDivElement] 264 369 wp-block-cover__action wp-block-cover__action <a class="wp-block-cover__action" href="https://ww 264 369 264 369 [object HTMLSpanElement] [object HTMLDivElement] [object HTMLDivElement] [object HTMLSpanElement] follow _self () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
17 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 follow https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /administration/melania-trump/ https://www.whitehouse.gov/administration/melania- [object HTMLDivElement] 264 369 wp-block-cover__action wp-block-cover__action <a class="wp-block-cover__action" href="https://ww 264 369 264 369 [object HTMLSpanElement] [object HTMLDivElement] [object HTMLDivElement] [object HTMLSpanElement] follow _self () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
18 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 follow https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /administration/the-cabinet/ https://www.whitehouse.gov/administration/the-cabi [object HTMLDivElement] 264 369 wp-block-cover__action wp-block-cover__action <a class="wp-block-cover__action" href="https://ww 264 369 264 369 [object HTMLSpanElement] [object HTMLDivElement] [object HTMLDivElement] [object HTMLSpanElement] follow _self () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
19 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Read More https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /issues/ https://www.whitehouse.gov/issues/ [object HTMLDivElement] 405 136 51 READ MORE READ MORE wp-block-button__link has-white-color has-text-col wp-block-button__link has-white-color has-text-col Read More <a class="wp-block-button__link has-white-color ha 132 47 132 47 [object HTMLDivElement] [object HTMLDivElement] [object Text] [object Text] Read More () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
20 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /about-the-white-house/the-white-house/ https://www.whitehouse.gov/about-the-white-house/t [object HTMLBodyElement] 3221 51 336 290 <img loading="lazy" decoding="async" width="336" h <a href="https://www.whitehouse.gov/about-the-whit 336 290 336 290 [object HTMLElement] [object HTMLElement] [object HTMLImageElement] [object HTMLImageElement] [object HTMLImageElement] [object HTMLImageElement] 1 () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
21 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /about-the-white-house/camp-david/ https://www.whitehouse.gov/about-the-white-house/c [object HTMLBodyElement] 3221 455 336 290 <img loading="lazy" decoding="async" width="336" h <a href="https://www.whitehouse.gov/about-the-whit 336 290 336 290 [object HTMLElement] [object HTMLElement] [object HTMLImageElement] [object HTMLImageElement] [object HTMLImageElement] [object HTMLImageElement] 1 () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
22 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /about-the-white-house/air-force-one/ https://www.whitehouse.gov/about-the-white-house/a [object HTMLBodyElement] 3221 860 336 290 <img loading="lazy" decoding="async" width="336" h <a href="https://www.whitehouse.gov/about-the-whit 336 290 336 290 [object HTMLElement] [object HTMLElement] [object HTMLImageElement] [object HTMLImageElement] [object HTMLImageElement] [object HTMLImageElement] 1 () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
23 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 News https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /news/ https://www.whitehouse.gov/news/ [object HTMLBodyElement] 3731 36 39 36 News News News News</a 39 36 39 36 [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] News () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
24 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Administration https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /administration/ https://www.whitehouse.gov/administration/ [object HTMLBodyElement] 3767 36 109 36 Administration Administration Administration <a href="https://www.whitehouse.gov/administration 109 36 109 36 [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Administration () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
25 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Issues https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /issues/ https://www.whitehouse.gov/issues/ [object HTMLBodyElement] 3803 36 43 36 Issues Issues Issues Issue 43 36 43 36 [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Issues () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
26 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Contact https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /contact/ https://www.whitehouse.gov/contact/ [object HTMLBodyElement] 3839 36 58 36 Contact Contact Contact Cont 58 36 58 36 [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Contact () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
27 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Visit https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /visit/ https://www.whitehouse.gov/visit/ [object HTMLBodyElement] 3875 36 33 36 Visit Visit Visit Visit< 33 36 33 36 [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Visit () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
28 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 noopener nofollow X https://twitter.com twitter.com twitter.com /whitehouse https://twitter.com/whitehouse [object HTMLBodyElement] 3861 1028 49 25 X X wp-block-social-link-anchor wp-block-social-link-anchor <svg width="24" height="24" viewBox="0 0 24 24" xm <a rel="noopener nofollow" target="_blank" href="h 49 25 49 25 [object HTMLLIElement] [object HTMLLIElement] [object SVGSVGElement] [object HTMLSpanElement] X noopener nofollow [object SVGSVGElement] [object HTMLSpanElement] 2 _blank () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
29 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 noopener nofollow Instagram https://www.instagram.com www.instagram.com www.instagram.com /whitehouse/ https://www.instagram.com/whitehouse/ [object HTMLBodyElement] 3861 1077 49 25 Instagram Instagram wp-block-social-link-anchor wp-block-social-link-anchor <svg width="24" height="24" viewBox="0 0 24 24" xm <a rel="noopener nofollow" target="_blank" href="h 49 25 49 25 [object HTMLLIElement] [object HTMLLIElement] [object SVGSVGElement] [object HTMLSpanElement] Instagram noopener nofollow [object SVGSVGElement] [object HTMLSpanElement] 2 _blank () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
30 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 noopener nofollow Facebook https://www.facebook.com www.facebook.com www.facebook.com /WhiteHouse/ https://www.facebook.com/WhiteHouse/ [object HTMLBodyElement] 3861 1127 49 25 Facebook Facebook wp-block-social-link-anchor wp-block-social-link-anchor <svg width="24" height="24" viewBox="0 0 24 24" xm <a rel="noopener nofollow" target="_blank" href="h 49 25 49 25 [object HTMLLIElement] [object HTMLLIElement] [object SVGSVGElement] [object HTMLSpanElement] Facebook noopener nofollow [object SVGSVGElement] [object HTMLSpanElement] 2 _blank () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
31 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 home WH.GOV https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov / https://www.whitehouse.gov/ [object HTMLBodyElement] 4231 36 93 46 WH.GOV WH.GOV WH.GOV <a href="https://www.whitehouse.gov" rel="home" ti 93 46 93 46 [object HTMLDivElement] [object HTMLDivElement] [object Text] [object Text] [object Text] WH.GOV home The White House () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
32 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Copyright https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /copyright/ https://www.whitehouse.gov/copyright/ [object HTMLBodyElement] 4244 285 69 18 Copyright Copyright Copyright Co [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Copyright () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
33 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 Privacy https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov /privacy/ https://www.whitehouse.gov/privacy/ [object HTMLBodyElement] 4244 402 49 18 Privacy Privacy Privacy Priv [object HTMLLIElement] [object HTMLLIElement] [object Text] [object Text] Privacy () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
34 <seleniumbase.undetected.webelement.WebElement (se mainframe 35 https://www.whitehouse.gov www.whitehouse.gov www.whitehouse.gov / #top https://www.whitehouse.gov/#top [object HTMLBodyElement] 4218 1159 88 72 <svg xmlns="http://www.w3.org/2000/svg" width="24" \n <svg x 88 72 88 72 [object HTMLDivElement] [object HTMLDivElement] [object Text] [object Text] [object Text] Back to top [object SVGSVGElement] [object SVGSVGElement] 1 () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () () None () C:\Program Files\Microsoft VS Code\seleniumpicture () A9CDA196DDDB8CEBE79EAB2ECF20D18C ()
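The output above contains the same link several times (for instance, /news/ appears in more than one row), which is why the example picks a row with `iloc[0]` after filtering. A minimal sketch for collapsing such duplicates is shown below. `unique_links` is a hypothetical helper, and it works on plain (text, href) pairs so it can be tried without a browser.

```python
def unique_links(pairs):
    """Keep the first (text, href) pair for each href, preserving order."""
    seen = set()
    unique = []
    for text, href in pairs:
        if href in seen:
            continue
        seen.add(href)
        unique.append((text, href))
    return unique


if __name__ == "__main__":
    # Sample values copied from the aa_text / aa_href columns above.
    sample = [
        ("News", "https://www.whitehouse.gov/news/"),
        ("Administration", "https://www.whitehouse.gov/administration/"),
        ("News", "https://www.whitehouse.gov/news/"),
    ]
    for text, href in unique_links(sample):
        print(text, href)
```

With a real result, the pairs could come from `zip(df2.aa_text, df2.aa_href)`. Staying in pandas, `df2.drop_duplicates(subset="aa_href")` gives a comparable result while keeping the other columns.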
