QUESTION 81
Which key role for a successful analytic project can provide business domain expertise with a deep understanding of the data and key performance indicators?

A. Business Intelligence Analyst
B. Project Manager
C. Project Sponsor
D. Business User

Correct Answer: A

QUESTION 82
Refer to the exhibit. You are using K-means clustering to classify customer behavior for a large retailer. You need to determine the optimum number
QUESTION 91
Consider these itemsets:

(hat, scarf, coat)
(hat, scarf, coat, gloves)
(hat, scarf, gloves)
(hat, gloves)
(scarf, coat, gloves)

What is the confidence of the rule (hat, scarf) -> gloves?

A. 66%
B. 40%
C. 50%
D. 60%

Correct Answer: A

QUESTION 92
Refer to the exhibit. You have run a linear regression model against your data, and have plotted true outcome versus predicted outcome. The
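Explanation (Question 91): the confidence of a rule X -> Y is the number of itemsets containing both X and Y divided by the number containing X. Three itemsets contain (hat, scarf) and two of those also contain gloves, so the confidence is 2/3, or about 66%. The arithmetic can be checked with a minimal Python sketch; the `confidence` helper below is a hypothetical name introduced for illustration and is not part of the exam material.

```python
# Itemsets copied from Question 91.
transactions = [
    {"hat", "scarf", "coat"},
    {"hat", "scarf", "coat", "gloves"},
    {"hat", "scarf", "gloves"},
    {"hat", "gloves"},
    {"scarf", "coat", "gloves"},
]


def confidence(antecedent, consequent, transactions):
    """Confidence of antecedent -> consequent.

    Hypothetical helper: count(antecedent and consequent) / count(antecedent).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    # Keep only the itemsets that contain every item of the antecedent.
    with_antecedent = [t for t in transactions if antecedent <= t]
    if not with_antecedent:
        raise ValueError("antecedent never occurs in the transactions")
    # Of those, count the itemsets that also contain the consequent.
    with_both = [t for t in with_antecedent if consequent <= t]
    return len(with_both) / len(with_antecedent)


rule_confidence = confidence({"hat", "scarf"}, {"gloves"}, transactions)
print(f"{rule_confidence:.1%}")  # 66.7%
```

The listed answer of 66% is this value truncated to a whole percentage.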
QUESTION 101
What is required in a presentation for business analysts?

A. Budgetary considerations and requests
B. Operational process changes
C. Detailed statistical explanation of the applicable modeling theory
D. The presentation author's credentials

Correct Answer: B

QUESTION 102
You are using MADlib for Linear Regression analysis. Which value does the statement return?

SELECT (linregr(depvar, indepvar)).r2 FROM zeta1;

A.
QUESTION 61
What would be considered "Big Data"?

A. An OLAP Cube containing customer demographic information about 100,000,000 customers
B. Daily Log files from a web server that receives 100,000 hits per minute
C. Aggregated statistical data stored in a relational database table
D. Spreadsheets containing monthly sales data for a Global 100 corporation

Correct Answer: B

QUESTION 62
Since R factors are categorical variables, they are most
QUESTION 71
What is an appropriate data visualization to use in a presentation for an analyst audience?

A. Pie chart
B. Area chart
C. Stacked bar chart
D. ROC curve

Correct Answer: D

QUESTION 72
Which process in text analysis can be used to reduce dimensionality?

A. Stemming
B. Parsing
C. Digitizing
D. Sorting

Correct Answer: A

QUESTION 73
When creating a presentation for a technical audience, what is the
QUESTION 41
You are using k-means clustering to classify heart patients for a hospital. You have chosen Patient Sex, Height, Weight, Age and Income as measures and have used 3 clusters. When you create a pair-wise plot of the clusters, you notice that there is significant overlap between the clusters. What should you do?

A. Identify additional measures to add to the analysis
B. Remove one of the measures
C. Decrease the number of clusters
D. Increase the number of clusters
QUESTION 51
Which word or phrase completes the statement? A Data Scientist would consider that an RDBMS is to a Table as R is to a ______________.

A. Data frame
B. List
C. Matrix
D. Array

Correct Answer: A

QUESTION 52
Refer to the exhibit. In the exhibit, the x-axis represents the derived probability of a borrower defaulting on a loan. Also in the exhibit, the pink represents borrowers that are known
QUESTION 31
Refer to the exhibit. You are asked to write a report on how specific variables impact your client's sales using a data set provided to you by the client. The data includes 15 variables that the client views as directly related to sales, and you are restricted to these variables only.

After a preliminary analysis of the data, the following findings were made:

1. Multicollinearity is not an issue among the variables
2. Only three variables (A, B, and C) have
QUESTION 11
Which data asset is an example of quasi-structured data?

A. Webserver log
B. XML data file
C. Database table
D. News article

Correct Answer: A

QUESTION 12
What is required in a presentation for project sponsors?

A. The "Big Picture" takeaways for executive level stakeholders
B. Data warehouse design changes
C. Line by line review of the developed code
D. Detailed statistical basis for the modeling approach
QUESTION 21
You are given 10,000,000 user profile pages of an online dating site in XML files, and they are stored in HDFS. You are assigned to divide the users into groups based on the content of their profiles. You have been instructed to try K-means clustering on this data. How should you proceed?

A. Run MapReduce to transform the data, and find relevant key value pairs.
B. Divide the data into sets of 1,000 user profiles, and run K-means clustering in RHadoop
QUESTION 1
Which word or phrase completes the statement? Structured data is to OLAP data as quasi-structured data is to ____

A. Clickstream data
B. XML data
C. Text documents
D. Image files

Correct Answer: A

QUESTION 2
Refer to the exhibit. Click on the calculator icon in the upper left corner. You are going into a meeting where you know your manager will have a question on your dataset -- specifically relating to customers that are classified