geonode.harvesting.harvesters.geonodeharvester ============================================== .. py:module:: geonode.harvesting.harvesters.geonodeharvester .. autoapi-nested-parse:: Harvesters GeoNode remote servers. Attributes ---------- .. autoapisummary:: geonode.harvesting.harvesters.geonodeharvester.logger Classes ------- .. autoapisummary:: geonode.harvesting.harvesters.geonodeharvester.GeoNodeDatasetType geonode.harvesting.harvesters.geonodeharvester.RemoteDatasetType geonode.harvesting.harvesters.geonodeharvester.GeoNodeResourceType geonode.harvesting.harvesters.geonodeharvester.GeoNodeResourceTypeCurrent geonode.harvesting.harvesters.geonodeharvester.GeonodeCurrentHarvester geonode.harvesting.harvesters.geonodeharvester.GeonodeLegacyHarvester geonode.harvesting.harvesters.geonodeharvester.GeonodeUnifiedHarvesterWorker Functions --------- .. autoapisummary:: geonode.harvesting.harvesters.geonodeharvester.get_contact_descriptor geonode.harvesting.harvesters.geonodeharvester.get_identification_descriptor geonode.harvesting.harvesters.geonodeharvester._get_native_format geonode.harvesting.harvesters.geonodeharvester.get_spatial_extent_4326 geonode.harvesting.harvesters.geonodeharvester.get_spatial_extent_native geonode.harvesting.harvesters.geonodeharvester.get_temporal_extent geonode.harvesting.harvesters.geonodeharvester._get_optional_attribute_value geonode.harvesting.harvesters.geonodeharvester._get_extra_config_schema geonode.harvesting.harvesters.geonodeharvester._from_django_record geonode.harvesting.harvesters.geonodeharvester._check_availability Module Contents --------------- .. py:data:: logger .. py:class:: GeoNodeDatasetType Bases: :py:obj:`enum.Enum` Generic enumeration. Derive from this class to define new enumerations. .. py:attribute:: VECTOR :value: 'vector' .. py:attribute:: RASTER :value: 'raster' .. py:class:: RemoteDatasetType Bases: :py:obj:`enum.Enum` Generic enumeration. Derive from this class to define new enumerations. .. py:attribute:: VECTOR :value: 'shapefile' .. py:attribute:: RASTER :value: 'geotiff' .. py:class:: GeoNodeResourceType Bases: :py:obj:`enum.Enum` Generic enumeration. Derive from this class to define new enumerations. .. py:attribute:: DOCUMENT :value: 'documents' .. py:attribute:: DATASET :value: 'layers' .. py:attribute:: MAP :value: 'maps' .. py:class:: GeoNodeResourceTypeCurrent Bases: :py:obj:`enum.Enum` Generic enumeration. Derive from this class to define new enumerations. .. py:attribute:: DOCUMENT :value: 'document' .. py:attribute:: DATASET :value: 'dataset' .. py:class:: GeonodeCurrentHarvester(*args, harvest_documents: Optional[bool] = True, harvest_datasets: Optional[bool] = True, copy_datasets: Optional[bool] = False, copy_documents: Optional[bool] = False, resource_title_filter: Optional[str] = None, start_date_filter: Optional[str] = None, end_date_filter: Optional[str] = None, keywords_filter: Optional[List[str]] = None, categories_filter: Optional[List[str]] = None, **kwargs) Bases: :py:obj:`geonode.harvesting.harvesters.base.BaseHarvesterWorker` A harvester for modern (v3.2+) GeoNode versions. GeoNode versions above 3.2 introduced the concept of `datasets` to replace the older `layers` concept. The API also has some significative differences. .. py:attribute:: harvest_documents :type: bool .. py:attribute:: harvest_datasets :type: bool .. py:attribute:: harvest_maps :type: bool :value: False .. py:attribute:: copy_documents :type: bool .. py:attribute:: copy_datasets :type: bool .. py:attribute:: resource_title_filter :type: Optional[str] .. py:attribute:: start_date_filter :type: Optional[str] .. py:attribute:: end_date_filter :type: Optional[str] .. py:attribute:: keywords_filter :type: Optional[List[str]] .. py:attribute:: categories_filter :type: Optional[List[str]] .. py:attribute:: http_session :type: requests.Session .. py:attribute:: page_size :type: int :value: 10 .. py:attribute:: remote_url .. py:property:: base_api_url .. py:property:: allows_copying_resources :type: bool .. py:method:: from_django_record(record: geonode.harvesting.models.Harvester) :classmethod: .. py:method:: get_extra_config_schema() -> Dict :classmethod: .. py:method:: get_num_available_resources() -> int .. py:method:: list_resources(offset: Optional[int] = 0) -> List[geonode.harvesting.harvesters.base.BriefRemoteResource] .. py:method:: check_availability(timeout_seconds: Optional[int] = 5) -> bool .. py:method:: get_geonode_resource_type(remote_resource_type: str) -> Type[Union[geonode.layers.models.Dataset, geonode.documents.models.Document]] .. py:method:: get_resource(harvestable_resource: geonode.harvesting.models.HarvestableResource) -> Optional[geonode.harvesting.harvesters.base.HarvestedResourceInfo] .. py:method:: should_copy_resource(harvestable_resource: geonode.harvesting.models.HarvestableResource) -> bool .. py:method:: get_geonode_resource_defaults(harvested_info: geonode.harvesting.harvesters.base.HarvestedResourceInfo, harvestable_resource: geonode.harvesting.models.HarvestableResource) -> Dict .. py:method:: _get_contact_descriptor(role, contact_details: Dict) .. py:method:: _get_related_name(contact_details: Dict) .. py:method:: _get_document_link_info(resource: Dict) .. py:method:: _get_dataset_link_info(resource: Dict, spatial_extent: django.contrib.gis.geos.Polygon) .. py:method:: _get_resource_link_info(resource: Dict, remote_resource_type: str, spatial_extent: django.contrib.gis.geos.Polygon) -> Tuple[str, str, str, str, Optional[str], Optional[str], Optional[str]] .. py:method:: _get_resource_descriptor(raw_resource: Dict, remote_resource_type: str) -> geonode.harvesting.resourcedescriptor.RecordDescription .. py:method:: _get_resource_list_params(offset: Optional[int] = 0) -> Dict .. py:class:: GeonodeLegacyHarvester(*args, harvest_documents: Optional[bool] = True, harvest_datasets: Optional[bool] = True, copy_datasets: Optional[bool] = False, copy_documents: Optional[bool] = False, resource_title_filter: Optional[str] = None, start_date_filter: Optional[str] = None, end_date_filter: Optional[str] = None, keywords_filter: Optional[list] = None, categories_filter: Optional[list] = None, **kwargs) Bases: :py:obj:`geonode.harvesting.harvesters.base.BaseHarvesterWorker` A harvester for older (v <= 3.2) GeoNode versions .. py:attribute:: harvest_documents :type: bool .. py:attribute:: harvest_datasets :type: bool .. py:attribute:: harvest_maps :type: bool :value: False .. py:attribute:: copy_documents :type: bool .. py:attribute:: copy_datasets :type: bool .. py:attribute:: resource_title_filter :type: Optional[str] .. py:attribute:: http_session :type: requests.Session .. py:attribute:: page_size :type: int :value: 10 .. py:attribute:: remote_url .. py:attribute:: start_date_filter .. py:attribute:: end_date_filter .. py:attribute:: keywords_filter .. py:attribute:: categories_filter .. py:property:: base_api_url .. py:property:: allows_copying_resources :type: bool .. py:method:: from_django_record(record: geonode.harvesting.models.Harvester) :classmethod: .. py:method:: get_extra_config_schema() -> Dict :classmethod: .. py:method:: get_num_available_resources() -> int .. py:method:: list_resources(offset: Optional[int] = 0) -> List[geonode.harvesting.harvesters.base.BriefRemoteResource] .. py:method:: check_availability(timeout_seconds: Optional[int] = 5) -> bool Check whether the remote GeoNode is online. .. py:method:: get_geonode_resource_type(remote_resource_type: str) -> Type[Union[geonode.layers.models.Dataset, geonode.documents.models.Document, geonode.maps.models.Map]] Return resource type class from resource type string. .. py:method:: get_resource(harvestable_resource: geonode.harvesting.models.HarvestableResource) -> Optional[geonode.harvesting.harvesters.base.HarvestedResourceInfo] .. py:method:: should_copy_resource(harvestable_resource: geonode.harvesting.models.HarvestableResource) -> bool .. py:method:: get_geonode_resource_defaults(harvested_info: geonode.harvesting.harvesters.base.HarvestedResourceInfo, harvestable_resource: geonode.harvesting.models.HarvestableResource) -> Dict .. py:method:: _get_num_available_resources_by_type() -> Dict[GeoNodeResourceType, int] .. py:method:: _list_document_resources(offset: int) -> List[geonode.harvesting.harvesters.base.BriefRemoteResource] .. py:method:: _list_dataset_resources(offset: int) -> List[geonode.harvesting.harvesters.base.BriefRemoteResource] .. py:method:: _list_map_resources(offset: int) -> List[geonode.harvesting.harvesters.base.BriefRemoteResource] .. py:method:: _list_resources_by_type(resource_type: GeoNodeResourceType, offset: int) -> List[geonode.harvesting.harvesters.base.BriefRemoteResource] .. py:method:: _extract_unique_identifier(raw_remote_resource: Dict) -> str .. py:method:: _get_resource_details(api_record: Dict, harvestable_resource: geonode.harvesting.models.HarvestableResource) -> Optional[geonode.harvesting.resourcedescriptor.RecordDescription] Produce a record description from the response provided by the remote GeoNode. .. py:method:: _get_resource_list_params(offset: Optional[int] = 0) -> Dict .. py:method:: _get_total_records(resource_type: GeoNodeResourceType) -> int .. py:method:: _get_resource_descriptor(csw_record: lxml.etree.Element, api_record: Dict, harvestable_resource: geonode.harvesting.models.HarvestableResource) -> geonode.harvesting.resourcedescriptor.RecordDescription .. py:method:: _get_dataset_additional_parameters(descriptor: geonode.harvesting.resourcedescriptor.RecordDescription, api_record: Dict) -> Dict .. py:method:: _get_document_additional_parameters(descriptor: geonode.harvesting.resourcedescriptor.RecordDescription, api_record: Dict) -> Dict .. py:method:: _get_map_additional_parameters(descriptor: geonode.harvesting.resourcedescriptor.RecordDescription, api_record: Dict) -> Dict .. py:method:: get_distribution_info(csw_distribution: lxml.etree.Element, api_record: Dict, harvestable_resource: geonode.harvesting.models.HarvestableResource, identification_descriptor: geonode.harvesting.resourcedescriptor.RecordIdentification, crs: str) -> geonode.harvesting.resourcedescriptor.RecordDistribution .. py:method:: _retrieve_thumbnail_url(api_record: Dict, harvestable_resource: geonode.harvesting.models.HarvestableResource) -> str .. py:class:: GeonodeUnifiedHarvesterWorker(*args, harvest_documents: Optional[bool] = True, harvest_datasets: Optional[bool] = True, copy_datasets: Optional[bool] = False, copy_documents: Optional[bool] = False, resource_title_filter: Optional[str] = None, start_date_filter: Optional[str] = None, end_date_filter: Optional[str] = None, keywords_filter: Optional[List[str]] = None, categories_filter: Optional[List[str]] = None, **kwargs) Bases: :py:obj:`geonode.harvesting.harvesters.base.BaseHarvesterWorker` A harvester worker that is able to retrieve details from most GeoNode deployments. This harvester type relies on the `GeonodeCurrentHarvester` and `GeonodeLegacyHarvester` for most operations. It simply determines which concrete harvester to use based on the remote's response for the availability check and then uses it. .. py:attribute:: _concrete_harvester_worker :type: Optional[Union[GeonodeCurrentHarvester, GeonodeLegacyHarvester]] .. py:attribute:: remote_url .. py:attribute:: http_session .. py:attribute:: harvest_documents .. py:attribute:: harvest_datasets .. py:attribute:: copy_datasets .. py:attribute:: copy_documents .. py:attribute:: resource_title_filter .. py:attribute:: start_date_filter .. py:attribute:: end_date_filter .. py:attribute:: keywords_filter .. py:attribute:: categories_filter .. py:property:: concrete_worker :type: Union[GeonodeCurrentHarvester, GeonodeLegacyHarvester] .. py:property:: allows_copying_resources :type: bool .. py:method:: from_django_record(record: geonode.harvesting.models.Harvester) :classmethod: .. py:method:: get_extra_config_schema() -> Dict :classmethod: .. py:method:: get_num_available_resources() -> int .. py:method:: list_resources(offset: Optional[int] = 0) -> List[geonode.harvesting.harvesters.base.BriefRemoteResource] .. py:method:: check_availability(timeout_seconds: Optional[int] = 5) -> bool .. py:method:: get_geonode_resource_type(remote_resource_type: str) -> Type[Union[geonode.layers.models.Dataset, geonode.documents.models.Document]] .. py:method:: get_resource(harvestable_resource: geonode.harvesting.models.HarvestableResource) -> Optional[geonode.harvesting.harvesters.base.HarvestedResourceInfo] .. py:method:: should_copy_resource(harvestable_resource: geonode.harvesting.models.HarvestableResource) -> bool .. py:method:: get_geonode_resource_defaults(harvested_info: geonode.harvesting.harvesters.base.HarvestedResourceInfo, harvestable_resource: geonode.harvesting.models.HarvestableResource) -> Dict .. py:method:: _get_concrete_worker() -> Union[GeonodeCurrentHarvester, GeonodeLegacyHarvester] .. py:function:: get_contact_descriptor(contact: lxml.etree.Element) .. py:function:: get_identification_descriptor(csw_identification: lxml.etree.Element, api_record: Dict) .. py:function:: _get_native_format(csw_identification: lxml.etree.Element, api_record: Dict) -> Optional[str] .. py:function:: get_spatial_extent_4326(identification_el: lxml.etree.Element) -> Optional[django.contrib.gis.geos.Polygon] .. py:function:: get_spatial_extent_native(api_record: Dict) .. py:function:: get_temporal_extent(identification_el: lxml.etree.Element) -> Optional[Tuple[datetime.datetime, datetime.datetime]] .. py:function:: _get_optional_attribute_value(element: lxml.etree.Element, xpath: str) -> Optional[str] .. py:function:: _get_extra_config_schema() -> Dict .. py:function:: _from_django_record(target_class: Type, record: geonode.harvesting.models.Harvester) .. py:function:: _check_availability(http_session, url: str, payload_key_to_check: str, timeout_seconds: Optional[int] = 5) -> bool