Big Data Online Practice Test - 2
This test covers Big Data end to end, with important questions ranging from basic to advanced level.
Q. In a quiz, the following code snippet was given:
object MyClass {
  def main(args: Array[String]) {
    var m1 = Map("Zootopia" -> 4, "Toy Story" -> 3);
    print(m1.apply("Zootopia"));
    print(m1.apply("Kung Fu Panda"))
  }
}
Mark the valid option; a short Map-lookup sketch follows below.
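A minimal sketch, assuming Scala 2.x, contrasting Map.apply with the safer Map.get and getOrElse lookups on the same keys; MapLookupSketch is just an illustrative object name:

object MapLookupSketch {
  def main(args: Array[String]): Unit = {
    val ratings = Map("Zootopia" -> 4, "Toy Story" -> 3)

    // apply returns the value for a key that exists...
    println(ratings.apply("Zootopia"))             // 4

    // ...but throws NoSuchElementException for a missing key,
    // so get and getOrElse are the safe alternatives.
    println(ratings.get("Kung Fu Panda"))          // None
    println(ratings.getOrElse("Kung Fu Panda", 0)) // 0
  }
}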
Q. A developer was told to write the following MongoDB query:
Display PortName and NumberOfShips from the Ports collection where _id is 4, but _id should not be displayed in the result.
Mark the correct syntax for the same; one possible form is sketched below.
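A minimal sketch, assuming the MongoDB Scala driver (org.mongodb.scala) is on the classpath; the connection string, the shipping database name, and the PortsProjectionSketch object are hypothetical, and the equivalent mongo-shell projection appears in a comment:

import org.mongodb.scala.MongoClient
import org.mongodb.scala.model.Filters.equal
import org.mongodb.scala.model.Projections.{excludeId, fields, include}

import scala.concurrent.Await
import scala.concurrent.duration._

object PortsProjectionSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical connection string and database name.
    val client = MongoClient("mongodb://localhost:27017")
    val ports  = client.getDatabase("shipping").getCollection("Ports")

    // Shell form: db.Ports.find({ _id: 4 }, { PortName: 1, NumberOfShips: 1, _id: 0 })
    val docs = Await.result(
      ports.find(equal("_id", 4))
        .projection(fields(include("PortName", "NumberOfShips"), excludeId()))
        .toFuture(),
      10.seconds)

    docs.foreach(doc => println(doc.toJson()))
    client.close()
  }
}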
Q. In an interview, the candidate was asked to mark whether the following statements are true or false:
1) Every Hadoop cluster has only one JobTracker daemon.
2) One slave node has one TaskTracker.
3) One Hadoop cluster can have only one NameNode.
4) For a 314 MB file, where the default block size is 64 MB and the replication factor is 3, the total number of blocks created is 5 (see the arithmetic sketch after this question).
Choose the correct option.
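A small arithmetic sketch for statement 4; it simply computes the per-replica block count and, for comparison, the number of block copies once replication is applied:

object BlockCountSketch {
  def main(args: Array[String]): Unit = {
    val fileSizeMb  = 314
    val blockSizeMb = 64
    val replication = 3

    // ceil(314 / 64) = 5 blocks: four full 64 MB blocks plus one 58 MB block.
    val blocks = math.ceil(fileSizeMb.toDouble / blockSizeMb).toInt
    println(s"Blocks per replica: $blocks")                            // 5
    println(s"Block copies with replication: ${blocks * replication}") // 15
  }
}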
Q. Below are a few features of Flume channels:
1) Data is lost if there is a power failure.
2) It is better suited for web server logs.
3) Events are stored in persistent storage.
4) Events can be stored in a Kafka cluster.
Match the channel names with their descriptions.
Q. Following are descriptions of the building blocks of Kafka. Match them with their names:
1) Data is stored in them. They are a stream of messages.
2) They are publishers of messages.
3) They help in maintaining published data.
4) They handle all reads and writes for a given partition.
Mark the appropriate option; a minimal producer sketch follows below.
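A minimal producer sketch, assuming the kafka-clients library, a broker at localhost:9092, and a hypothetical topic named movie-ratings; it touches the topics that store messages, the producers that publish them, and the brokers that serve reads and writes:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object KafkaProducerSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // Brokers maintain the published data and serve reads/writes for the
    // partitions they lead; the producer only needs a bootstrap address.
    props.put("bootstrap.servers", "localhost:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    // Topics hold the data as a stream of messages; producers publish to them.
    producer.send(new ProducerRecord[String, String]("movie-ratings", "Zootopia", "4"))
    producer.close()
  }
}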
Q. A developer was told to change the replication factor of a directory from 3 to 4. He set the dfs.replication property in hdfs-site.xml to 4 and then created some new files in the directory. Consider the following points and mark the correct statement (a hedged API sketch follows below):
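A hedged sketch, assuming the Hadoop FileSystem API is available; dfs.replication in hdfs-site.xml only sets the default for files created afterwards, so files that already exist are changed explicitly here. The /data/reports path is hypothetical:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object SetReplicationSketch {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()      // picks up core-site.xml / hdfs-site.xml
    val fs   = FileSystem.get(conf)

    // The hdfs dfs -setrep command is the usual shell way to do the same.
    val dir = new Path("/data/reports") // hypothetical directory
    fs.listStatus(dir).filter(_.isFile).foreach { status =>
      fs.setReplication(status.getPath, 4.toShort)
    }
    fs.close()
  }
}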
Q. A group of leads were involved in a discussion about data locality in Hadoop:
Lead A - It is not always possible to move algorithms close to the data, so the data should be brought closer to them. This minimizes network congestion.
Lead B - It is much more efficient if the algorithms are brought closer to the data. Though this decreases the throughput, it minimizes the network congestion problem. Inter-rack locality is the most preferred scenario.
Lead C - The inter-rack scenario is the least preferred. Data-local data locality is most preferred, as the data is on the same node as the mapper.
Mark the correct option.
Q. I present developers with a ready-to-use framework which allows them to perform data mining on massive amounts of data:
My algorithms are written on top of Hadoop, so I work well in a distributed environment.
I am mainly used for creating many machine learning algorithms.
I consist of multiple matrix and vector libraries.
Mark the framework.
Q. In a quiz, freshers were given the following multiple-choice statements related to Hive:
1) Bucketed tables are not stored as a file.
2) Sub-queries are supported in Hive only in the _____ clause.
3) Custom types and functions can be defined in Hive.
4) HQL allows downloading the contents of a table to a local directory (see the sketch after this question).
Mark the correct option.
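A hedged sketch for statement 4, assuming a HiveServer2 endpoint reachable over JDBC with the Hive JDBC driver on the classpath; the URL, credentials, export path, and the ports table are all hypothetical:

import java.sql.DriverManager

object HiveExportSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical HiveServer2 URL, user and password.
    val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "")
    val stmt = conn.createStatement()

    // HQL can write a table's contents out to a local directory on the node
    // running the job; the target path here is illustrative only.
    stmt.execute(
      "INSERT OVERWRITE LOCAL DIRECTORY '/tmp/ports_export' " +
      "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' " +
      "SELECT * FROM ports")

    stmt.close()
    conn.close()
  }
}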
Q. In an interview, a fresher was asked to mention the features of Spark transformations and actions.
She mixed them up in her answer:
1) They are evaluated on demand.
2) From existing RDDs, new RDDs are created.
3) To load data into the original RDD, they trigger a lineage graph.
4) They return the final values of the RDD computation.
Mark which of them belong to Transformations (T) or Actions (A); a minimal RDD sketch follows below.
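A minimal sketch, assuming a local SparkContext; map and filter below are transformations (lazy, building up the lineage), while count and collect are the actions that trigger execution and return final values:

import org.apache.spark.{SparkConf, SparkContext}

object RddLazinessSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("rdd-laziness").setMaster("local[*]"))

    val numbers = sc.parallelize(1 to 10)

    // Transformations: build new RDDs from existing ones and are evaluated on demand.
    val evens   = numbers.filter(_ % 2 == 0)
    val squares = evens.map(n => n * n)

    // Actions: walk the lineage graph back to the original data and return final values.
    println(squares.count())                    // 5
    println(squares.collect().mkString(", "))   // 4, 16, 36, 64, 100
    sc.stop()
  }
}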