최신 Professional-Data-Engineer 무료덤프 - Google Certified Professional Data Engineer

Your team is responsible for developing and maintaining ETLs in your company. One of your Dataflow jobs is failing because of some errors in the input data, and you need to improve reliability of the pipeline (incl. being able to reprocess all failing data).
What should you do?

정답: D
설명: (DumpTOP 회원만 볼 수 있음)
After migrating ETL jobs to run on BigQuery, you need to verify that the output of the migrated jobs is the same as the output of the original. You've loaded a table containing the output of the original job and want to compare the contents with output from the migrated job to show that they are identical. The tables do not contain a primary key column that would enable you to join them together for comparison.
What should you do?

정답: D
You want to use Google Stackdriver Logging to monitor Google BigQuery usage. You need an instant notification to be sent to your monitoring tool when new data is appended to a certain table using an insert job, but you do not want to receive notifications for other tables. What should you do?

정답: A
Which of the following is NOT one of the three main types of triggers that Dataflow supports?

정답: B
설명: (DumpTOP 회원만 볼 수 있음)
You work for a farming company. You have one BigQuery table named sensors, which is about 500 MB and contains the list of your 5000 sensors, with columns for id, name, and location. This table is updated every hour. Each sensor generates one metric every 30 seconds along with a timestamp. which you want to store in BigQuery. You want to run an analytical query on the data once a week for monitoring purposes. You also want to minimize costs. What data model should you use?

정답: B
설명: (DumpTOP 회원만 볼 수 있음)
You are integrating one of your internal IT applications and Google BigQuery, so users can query BigQuery from the application's interface. You do not want individual users to authenticate to BigQuery and you do not want to give them access to the dataset. You need to securely access BigQuery from your IT application.
What should you do?

정답: A
When you design a Google Cloud Bigtable schema it is recommended that you _________.

정답: A
설명: (DumpTOP 회원만 볼 수 있음)
You are designing a data mesh on Google Cloud with multiple distinct data engineering teams building data products. The typical data curation design pattern consists of landing files in Cloud Storage, transforming raw data in Cloud Storage and BigQuery datasets. and storing the final curated data product in BigQuery datasets You need to configure Dataplex to ensure that each team can access only the assets needed to build their data products. You also need to ensure that teams can easily share the curated data product. What should you do?

정답: A
설명: (DumpTOP 회원만 볼 수 있음)
You issue a new batch job to Dataflow. The job starts successfully, processes a few elements, and then suddenly fails and shuts down. You navigate to the Dataflow monitoring interface where you find errors related to a particular DoFn in your pipeline. What is the most likely cause of the errors?

정답: D
설명: (DumpTOP 회원만 볼 수 있음)
Which of these are examples of a value in a sparse vector? (Select 2 answers.)

정답: B,D
설명: (DumpTOP 회원만 볼 수 있음)
You are using Workflows to call an API that returns a 1 KB JSON response, apply some complex business logic on this response, wait for the logic to complete, and then perform a load from a Cloud Storage file to BigQuery. The Workflows standard library does not have sufficient capabilities to perform your complex logic, and you want to use Python's standard library instead. You want to optimize your workflow for simplicity and speed of execution. What should you do?

정답: A
You are implementing workflow pipeline scheduling using open source-based tools and Google Kubernetes Engine (GKE). You want to use a Google managed service to simplify and automate the task. You also want to accommodate Shared VPC networking considerations. What should you do?

정답: D
설명: (DumpTOP 회원만 볼 수 있음)
Which is the preferred method to use to avoid hotspotting in time series data in Bigtable?

정답: B
설명: (DumpTOP 회원만 볼 수 있음)
Does Dataflow process batch data pipelines or streaming data pipelines?

정답: B
설명: (DumpTOP 회원만 볼 수 있음)
You are working on a linear regression model on BigQuery ML to predict a customer's likelihood of purchasing your company's products. Your model uses a city name variable as a key predictive component in order to train and serve the model your data must be organized in columns. You want to prepare your data using the least amount of coding while maintaining the predictable variables. What should you do?

정답: D
You need (o give new website users a globally unique identifier (GUID) using a service that takes in data points and returns a GUID This data is sourced from both internal and external systems via HTTP calls that you will make via microservices within your pipeline There will be tens of thousands of messages per second and that can be multithreaded, and you worry about the backpressure on the system How should you design your pipeline to minimize that backpressure?

정답: B
What are the minimum permissions needed for a service account used with Google Dataproc?

정답: B
설명: (DumpTOP 회원만 볼 수 있음)
You have a Standard Tier Memorystore for Redis instance deployed in a production environment. You need to simulate a Redis instance failover in the most accurate disaster recovery situation, and ensure that the failover has no impact on production dat a. What should you do?

정답: D
설명: (DumpTOP 회원만 볼 수 있음)
You are designing a basket abandonment system for an ecommerce company. The system will send a message to a user based on these rules:
No interaction by the user on the site for 1 hour
Has added more than $30 worth of products to the basket
Has not completed a transaction
You use Google Cloud Dataflow to process the data and decide if a message should be sent. How should you design the pipeline?

정답: A

우리와 연락하기

문의할 점이 있으시면 메일을 보내오세요. 12시간이내에 답장드리도록 하고 있습니다.

근무시간: ( UTC+9 ) 9:00-24:00
월요일~토요일

서포트: 바로 연락하기