Big Data Online Practice Test - 1

This test covers Big Data end to end with important questions, ranging from the basics to an advanced level.

Q. A lead was explaining the concept of a reducer to his team. Which of the following statements correctly describe it?
1) Shuffle and Sort are a part of the reducer. They occur simultaneously.
2) Just like a map-only job, a reduce-only job is also possible in any situation.
3) Hadoop will allow a reduce-only job if shuffle and sort are omitted.
4) The output of the reducer is the final output, which is stored in HDFS.
Mark the valid option.

A. Statements 1, 3 and 4 are correct.

B. Statements 1 and 4 are correct.

C. Statements 1, 2 and 4 are correct.

D. All the statements are correct.

Q. Which of the following statements are correct regarding the output format of MapReduce jobs?
1) If the last output filename of a job is test-r-00025, it means there were 26 reducers.
2) result-r-00001 means it is a reducer output. For a map-only job, the 'r' is replaced by 'm'.
3) By setting the configuration mapreduce.set.outputfilename, one can change the output file name.
4) If the format of the output file is part-r-yyyyy, then yyyyy represents the task number.
Mark the valid option.

A. Only the 3rd statement is incorrect. The rest are valid.

B. The 1st and 4th statements are incorrect.

C. Only the 4th statement is correct. The rest are wrong.

D. The 2nd and 4th statements are correct. The other two are incorrect.
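
For readers who want to see how such output names are put together, here is a minimal Python sketch. It assumes the default Hadoop naming pattern of a base name, a one-letter task type, and a five-digit zero-padded number that starts at 0. The helper names output_file_name and parse_output_file_name are hypothetical and are not part of any Hadoop API.

```python
def output_file_name(base, task_type, number):
    """Build a name such as part-r-00000 (hypothetical helper, not a Hadoop call)."""
    if task_type not in ("m", "r"):
        raise ValueError("task_type must be 'm' (map) or 'r' (reduce)")
    # Assumption: numbers are zero-padded to five digits and start at 0.
    return f"{base}-{task_type}-{number:05d}"


def parse_output_file_name(name):
    """Split a name like test-r-00025 into (base, task_type, number)."""
    base, task_type, digits = name.rsplit("-", 2)
    return base, task_type, int(digits)


print(output_file_name("part", "r", 0))
print(parse_output_file_name("test-r-00025"))
```

Parsing from the right with rsplit keeps the sketch working for base names that themselves contain a hyphen.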

Q. A lead was training his team on Hive serialization. Which of the following points are valid or invalid?
1) SerDe is a library that is built into the Hadoop API.
2) Hive learns how to process a record through the Serializer or Deserializer.
3) To read and write delimited records, such as Control-A separated records, one can use the SerDe library.
4) Hive uses the MetadataTypedColumnsetSerDe class to serialize or deserialize data.
Mark the valid option.

A. The 1st statement is wrong, as SerDe is not a Hadoop library.

B. Control-A separated records cannot be read or written using SerDe, so the 3rd statement is wrong.

C. All the statements are valid.

D. The 4th statement is wrong, as the class used by Hive is MetadataTypedColumnSerDe.

Q. Below are a few points about the Pig framework. Mark each as an advantage (A) or a disadvantage (D):
1) In Pig, commands are not executed unless they are stored or dumped.
2) Pig does not enforce an explicit schema.
3) In Pig, it is possible to control the execution of every step.
4) In Pig, a program is not evaluated unless it writes an output file or outputs a message.
Mark the correct option.

A. A, D, D, A

B. D, D, A, A

C. D, A, A, D

D. D, A, D, A

Q. In an interview, the following statements were mentioned. Mark them as True or False:
1) Hive has higher latency than HBase.
2) HBase uses SQL while Hive uses HQL.
3) HBase supports secondary indexes.
4) HBase supports real time processing.
Mark the proper option.

A. True, False, False, True

B. True, False, True, False

C. True, False, True, True

D. False, False, True, False

Q. A developer working on MongoDB was told to write the following queries:
1) Update the document with _id:3 in the movies collection and add the name "zootopia" to the animation array.
2) Find documents in the movies collection where Genre is not set to 'A'.
Mark the correct option.

A. 1) db.movies.update({_id:3},{add:{animation:"zootopia"}}); 2) db.movies.find({Genre:{$nset:'A'}}).pretty();

B. 1) db.movies.add({_id:3},{updateToSet:{animation:"zootopia"}}); 2) db.movies.find({Genre:{$skip:'A'}}).pretty();

C. 1) db.movies.update({_id:3},{$addToSet:{animation:"zootopia"}}); 2) db.movies.find({Genre:{$ne:'A'}}).pretty();

D. 1) db.movies.add({_id:3},{appendToSet:{animation:"zootopia"}}); 2) db.movies.find({Genre:{$neq:'A'}}).pretty();
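
To make the intended behaviour of the two requirements concrete, the sketch below models them on plain Python dictionaries, with no database involved. It does not show MongoDB syntax. The sample documents and the helper names add_to_array_once and where_not_equal are assumptions for illustration, as are two design choices: a value is added only if it is not already in the array, and a document with no Genre field counts as "not set to 'A'".

```python
# Hypothetical sample data standing in for the movies collection.
movies = [
    {"_id": 1, "Genre": "A", "animation": ["frozen"]},
    {"_id": 3, "Genre": "B", "animation": ["moana"]},
    {"_id": 4, "animation": []},
]


def add_to_array_once(docs, doc_id, field, value):
    """Append value to doc[field] for the matching _id, skipping duplicates."""
    for doc in docs:
        if doc["_id"] == doc_id:
            values = doc.setdefault(field, [])
            if value not in values:
                values.append(value)
            return True
    return False  # no document with that _id


def where_not_equal(docs, field, value):
    """Return documents whose field differs from value or is missing."""
    return [doc for doc in docs if doc.get(field) != value]


add_to_array_once(movies, 3, "animation", "zootopia")
print(movies[1]["animation"])
print([doc["_id"] for doc in where_not_equal(movies, "Genre", "A")])
```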

Q. A developer was working on Scala code snippets. He was asked to write them for the following statements:
1) Iterate over each element in the tuple.
2) Add the values in the tuple.
The given tuple was: val ttuple = (5,2,9,1).
Mark the correct syntax.

A. 1) ttuple.productIterator.foreach{ i => println("Value ="+ i )} 2) val sum = ttuple._1 + ttuple._2 + ttuple._3 + ttuple._4; println("Addition of elements:"+ sum )

B. 1) ttuple.productIterator.foreach{ i => println("Value ="+ i )} 2) val sum = ttuple._1 + ttuple._2 + ttuple._3 + ttuple._4; println("Addition of elements:"+ sum )

C. 1) ttuple.iteratorProduct.foreach{ i => println("Value ="+ i )} 2) val sum = ttuple[0] + ttuple[1]+ ttuple[2]+ ttuple[3]; println("Addition of elements:"+ sum )

D. 1) ttuple.foreach{ i =>println("Value ="+ i )} 2) val sum = ttuple[0] + ttuple[1]+ ttuple[2]+ ttuple[3]; println("Addition of elements:"+ sum )

Q. A fresher was told to write a Pig query for a relation of nations. He was supposed to partition them based on their GDP rank.
Which of the following queries best suits the requirement?

A. A = load '/testdemo/nations.tsv' as (id:int,name:chararray,gdprank:int); SPLIT A INTO M IF gdprank <=10, N IF gdprank > 10; DUMP M;

B. A = load '/testdemo/nations.tsv' as (id:int,name:chararray,gdprank:int); SLICE A INTO M IF gdprank <=10, N IF gdprank > 10; DUMP M;

C. A = load '/testdemo/nations.tsv' as (id:int,name:chararray,gdprank:int); DUMP A INTO M IF gdprank <=10, N IF gdprank > 10; DUMP M;

D. A = load '/testdemo/nations.tsv' as (id:int,name:chararray,gdprank:int); PUT A INTO M IF gdprank <=10, N IF gdprank > 10; DUMP M;
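
The requirement itself, partitioning rows into two groups by GDP rank, can be sketched in Python to show what the result should look like. This is a minimal illustration rather than Pig Latin: the sample rows, the threshold parameter, and the helper names load_nations and partition_by_rank are assumptions, with the column layout (id, name, gdprank) and the cut-off of 10 taken from the options above.

```python
import csv
import io

# Hypothetical tab-separated sample standing in for nations.tsv.
SAMPLE_TSV = "1\tAlpha\t3\n2\tBeta\t10\n3\tGamma\t27\n"


def load_nations(text):
    """Parse tab-separated rows into (id, name, gdprank) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [(int(i), name, int(rank)) for i, name, rank in reader]


def partition_by_rank(rows, threshold=10):
    """Return (rows with gdprank <= threshold, rows with gdprank > threshold)."""
    top = [row for row in rows if row[2] <= threshold]
    rest = [row for row in rows if row[2] > threshold]
    return top, rest


m, n = partition_by_rank(load_nations(SAMPLE_TSV))
print(m)
print(n)
```

Every row lands in exactly one of the two groups because the two conditions are complementary.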

Q. A developer was given some scenarios and was required to analyse whether to use MapReduce or Pig in each:
1) A definite driver control program is needed.
2) The code requires less debugging.
3) The output of one job is an input to another job.
4) The job uses the distributed cache, cross products and joins.
Mark the correct option about the scenarios.

A. MapReduce, Pig, MapReduce, MapReduce.

B. Pig, Pig, MapReduce, MapReduce.

C. Pig, MapReduce, Pig, Pig.

D. MapReduce, Pig, MapReduce, Pig.

Q. A developer was learning the Cassandra data model. In the tutorial, some of the points mentioned were as follows. Mark each as correct or incorrect:
1) While using Cassandra in production, one should generally work with the Network Topology strategy.
2) There is at least one column family per keyspace.
3) The rows in each column are themselves a collection of many columns.
4) Cassandra column families can be changed, as they are not predefined.
Choose the appropriate option.

A. Correct, Incorrect, Incorrect, Correct

B. Correct, Correct, Correct, Incorrect

C. Correct, Incorrect, Incorrect, Incorrect

D. Incorrect, Correct, Correct, Incorrect
